The wrong message on vaccines


Unfounded fears about vaccines are already reaching worrisome proportions. No public figure 
should stoke them — as US presidential hopeful Michele Bachmann has done. 


In 2009 and 2010, fewer than half of all US states reported that the
proportion of children aged five to six who were properly vaccinated
against measles before they entered school had reached the desired 95%.
In parts of the country, the rate of refusal of mandatory childhood
vaccinations for non-medical reasons stands at 25%. And as-yet-unpublished
data show that this rate is continuing to increase.

The results of vaccine refusal are already evident in Europe. France
reported 4,937 cases of measles in the first three months of this year
— nearly as many as in all of 2010. In total, 30 countries in the World
Health Organization's European region reported a marked increase in
measles cases early this year. At some point, the herd immunity that
protects the unvaccinated and the immunosuppressed could be lost.

Against this backdrop, it is vital that public debates on vaccination
stick to the facts — and that politicians who make science-supported
decisions be applauded. Unfortunately, it was Michele Bachmann who
received the applause at the Republican presidential candidates’ debate
earlier this month. The Minnesota congresswoman had attacked rival
candidate Rick Perry for his failed attempt in 2007, as Texas governor,
to mandate vaccination against human papilloma virus (HPV) for 11- and
12-year-old schoolgirls, as recommended by the US Centers for
Disease Control and Prevention (CDC) in Atlanta, Georgia.

Perhaps Perry did the right thing for the wrong reasons: he has close
ties to pharmaceutical company Merck, a generous donor to his campaigns
and the only maker of an HPV vaccine at the time of his attempt.
But his goal was laudable: HPV is the most common sexually transmitted
infection in the country and the major cause of cervical cancer,
which kills 4,000 US women each year. The Food and Drug Administration
has also approved the HPV jab for the prevention of vulvar and
vaginal cancers, and of anal cancer in both males and females.

That did not stop Bachmann from making the astonishingly irresponsible
claim, on national television, that the vaccine is a “potentially
dangerous drug”. She later suggested that it is linked to “mental
retardation”. Yet the CDC says that the vaccine is safe. Some 35 million
doses have been delivered in the United States since its approval,
but just 0.05% of recipients have reported side effects, mostly minor.
Nor is there scientific support for the belief that presumably drives 
Bachmann's misstatements — that vaccinating prepubescent girls will 
somehow encourage them to become sexually active. 

If Bachmann wants to do right by the millions of girls she claims to 
care about, she ought to retract her words and urge HPV vaccination. 
That might do more than anything else to combat an increasingly 
common parental mindset that takes for granted the past century of 
gains against infectious disease, and in so doing threatens to reverse 
them.


Beyond the bomb 


Twenty years after the end of the cold war,
scientists and the military still need each other.


With a science and technology budget that currently stands
at about US$12 billion per year, the US defence complex is
the world’s largest investor in military research. Much of the
money has gone into developing weapons of unprecedented lethality, 
but a large fraction supports ‘dual-use’ research, whose products — from 
the Internet to the Global Positioning System — have enriched society 
as a whole. And the trove of military data has proved surprisingly useful 
to scientists studying environmental change (see page 388). 

Military efforts are also helping to improve public health. Studies 
of traumatic brain injuries inflicted by bomb blasts (see page 390) 
could aid in the diagnosis and treatment of brain injuries in civilians. 
And the need to keep troops healthy has resulted in advances ranging 
from a partially effective vaccine against HIV to a mobile-phone-based
reporting system for disease cases (see page 395). 

Such programmes have been strengthened by JASON, an independent
panel of high-level scientists whose advice is often brutally frank
(see page 397). But the Pentagon can and should do much more to
support dual-use science — by, for example, minimizing the bureau- 
cracy and secrecy that still make it far too difficult for outsiders to gain 
access to military data. 

Defence officials should also insist that their public-health research 
be meticulously transparent about goals and methods — this is cru- 
cial to overcoming mistrust in the developing world. At home, the 
Pentagon could enhance its credibility among academics by funding 
discussions on the ethical, legal and social implications of its research
— for example, the development of robotic warfare (see page 399). 

Most fundamentally, Congress and the Pentagon should continue 
their strong support for military science. This is not as axiomatic as it 
was when the United States was in a decades-long, high-stakes techno- 
logical race with the Soviet Union. Much of today’s military research, 
in the United States and elsewhere, consists of shorter-term problem- 
solving, such as how to deal with low-tech roadside explosives, or the 
development of virtual worlds for training troops or aiding their post- 
injury recovery (see page 406). As the mission becomes more diffuse, 
high-level support for military science may wane, especially as the
Pentagon’s overall funding comes under scrutiny (see page 386). Yet
cutting and narrowing military research would be short-sighted, especially
when the concept of national security is itself expanding, to include not 
just military strength, but public health, economic vigour, dealing with 
climate change, and all the other factors that make for a strong society.
China’s new forests aren’t as green as they seem

Impressive reports of increased forest cover mask a focus on non-native tree
crops that could damage the ecosystem, says Jianchu Xu.

In the United Nations’ 2011 International Year of Forests, China is
heralded as a superstar. Almost single-handedly, the country has
halted long-term forest loss across Asia, and even turned it into
a net gain. Since the 1990s, China has planted more than 4 million
hectares of new forest each year.

Earlier this month, President Hu Jintao pledged that China would
do even more. He told a meeting of the Asia-Pacific Economic Cooperation
Forum in Beijing that the nation would increase its total area
of forest by 40 million hectares over the next decade. China, he said, is
ready to make new contributions to green, sustainable growth.

It sounds impressive, but we risk failing to see the wood for the trees.
In China, ‘forest’ includes uncut primary forest, regenerating natural
forest and monoculture plantations of non-native trees. The last of
these accounts for most of the ‘improvement’ in forest cover.

The State Forestry Administration has claimed that total forest cover
in China reached 20.36% in 2008. Most of this results from the increase
in tree crops such as fruit trees, rubber and eucalyptus, not recovery
of natural forest, yet Chinese data do not record this shift. The change
threatens ecosystem services, particularly watershed protection and
biodiversity conservation.

Exotic tree species are being planted in arid and semi-arid conditions,
where perennial grasses with their extensive root systems would be better
protectors of topsoil. Plantation monocultures harbour little diversity;
they provide almost no habitat for the country’s many threatened forest
species. Plantations generate less leaf litter and other organic inputs
than native forests, so soil fauna and flora decrease, and groundwater
depletion can be exacerbated by deep-rooted non-native trees that use more
water than native species. Afforestation in water-stressed regions might
provide wind-breaks, and tree plantations offer some carbon storage.
But these benefits come at a high cost to other ecological functions.

Why the intense focus on forest cover? China has long promoted
the planting of tree crops. Since 1999, the Grain for Green programme
has resulted in some 22 million hectares of new trees on sloping
farmland. The programme began after the 1998 Yangtze River floods,
which the government blamed on loss of tree cover, although reductions
in riparian buffers and soil infiltration capacity probably also
had a major role.

Since 2008, forest tenure reform has encouraged the privatization
of former collective forests, with more than 100 million hectares
affected. Privatization can benefit local economies. But in the
absence of any management framework, it has also promoted conversion
of natural forests into plantations: smallholders often fell natural
forests for immediate income, then plant monoculture tree crops for
long-term investment.

Although the Chinese government has shown that it understands 
environmental fragility, its scientific and policy guidelines do not ade- 
quately address the country’s diversity of landscapes and ecosystems. 
I have seen massive tree plantations on the Tibetan Plateau, in areas 
where forests never grew before. Local governments face the need 
to respond to the national imperative for increased forest cover by 
planting fast-growing species, while also generating the biggest local 
economic benefits possible. This explains why unsuitable species such 
as aspens are planted in north China, whereas eucalyptus and rubber 
trees proliferate in the south. 

Perhaps the International Year of Forests can help decision-makers 
to focus on the various meanings of ‘forest’, and
the trade-offs each type entails. Natural recovery 
is still the best way to restore damaged forests, 
but restoration requires targeted involvement 
using the best science. 

Afforestation can restore ecosystem function 
only if the right species are planted in the right 
place. Further studies are needed on how the mix 
of species affects ecosystem functions. Sloping 
lands, for example, benefit from perennial root 
systems and associated soil microfauna, but trees 
are not the only, or necessarily the best, way to 
establish these root systems. 

China's forestry mandate should focus on 
enhancing environmental services, but policy- 
makers cannot ignore rural livelihoods. Technical 
know-how should be provided to local foresters 
and farmers. Doing away with narrow, one-size- 
fits-all management targets would also help. The 
country, with its state-managed market economy, can afford direct pay- 
ments for forest ecosystem services, but they should only be offered for 
natural or regenerated forests with proven biological or ecological value. 

As an ecologist and agroforestry practitioner, I would like to see 
China establish parallel forest-management programmes for recovery 
and restoration of natural forests, and for incorporating working trees 
into farmlands. Each should include best practices from ecosystem sci- 
ence; a clear definition of tree crop plantations for timber or non-timber 
products would clarify the separate systems. A dual strategy would 
require increased collaboration throughout China's land-management 
ministries, well supported by interdisciplinary research. But it could 
ensure that China's massive investment in forests provides maximum 
benefits, to both local livelihood and the environment.


Jianchu Xu is a senior scientist at the World Agroforestry Centre and 
a professor at the Kunming Institute of Botany, Chinese Academy of 
Sciences. e-mail: J.C.Xu@cgiar.org
POLICY


Deepwater report
The full catalogue of failures that led to the destruction of the
Deepwater Horizon drilling rig and the subsequent oil spill in the
Gulf of Mexico last year has been laid out in the final US government
report, released on 14 September. Investigators from the Bureau of
Ocean Energy Management, Regulation and Enforcement and the US Coast
Guard conclude that rig-owners Transocean, contracting company
Halliburton and “designated operator” BP all violated a number of
federal regulations. See go.nature.com/gxp9i4 for more.


TB in Europe 

The World Health 
Organization (WHO) has 
launched a programme to slash 
soaring rates of drug-resistant 
tuberculosis (TB) in Europe. 
Just one-third of an estimated 
80,000 drug-resistant TB 
infections are reported to the 
European Centre for Disease 
Prevention and Control and 
the WHO in Europe each year. 
The WHO programme aims to 
raise this to 85% by 2015, and 
to treat three-quarters of those 
cases. The US$5 billion needed 
will come from industry, 
non-profit organizations and 
the 53 member states in the 
WHO European region. See 
go.nature.com/rezftm for 
more. 


Israel joins CERN
Israel has become the first non-European country to join CERN,
Europe’s high-energy physics research centre near Geneva, Switzerland.
The country, which is the centre's 21st member, will get voting
rights on CERN’s council and will have to contribute to the centre’s
budget. Its membership was officially confirmed on 16 September,
after Israel’s cabinet voted to join the lab in April (see
Nature 472, 265; 2011).


Hope for space telescope
A US Senate subcommittee voted on 14 September to continue
funding the James Webb Space Telescope (pictured), the
successor to the Hubble Space Telescope. Its climbing price
tag, now estimated at US$8.7 billion, is devouring NASA’s
astrophysics budget, and a subcommittee in the House of
Representatives had voted to cancel the project. The Senate
subcommittee, led by Barbara Mikulski (Democrat, Maryland),
wants the telescope to get $530 million in 2012 — much more
than the $374 million in the president's budget request. See
go.nature.com/ei3ije for more.


Stem-cell lawsuit 
Two researchers seeking to 
block US government funding 
of research using human 
embryonic stem cells are still 
battling to continue their 
lawsuit. On 19 September, 
James Sherley and Theresa 
Deisher appealed against a 
27 July decision in which a 
federal judge ruled against 
their case (Nature 476, 14–15;
2011). See go.nature.com/dkr1wj
for more.


Biofuels error 


Countries that encourage 
increased use of energy 

from biomass may be 
overestimating the savings in 
greenhouse-gas emissions. In 
an Opinion report published 
on 15 September, the 
19-strong scientific committee 
of the European Environment 
Agency has criticized what 

it calls a “serious accounting
error” if policies incorrectly
assume that the combustion 
of biomass is carbon-neutral. 
That is not always the case: the 
biomass may replace forest 
that would otherwise store 
carbon, or replace food crops 
that must then be planted and 
harvested on other land that 
was once forest. Emissions 
resulting from such changes 
in land use are not correctly 
accounted for in, for example, 
the European Union's 
Renewable Energy Directive, 
the committee said. 


Global Fund review 
The Global Fund to Fight 
AIDS, Tuberculosis and 
Malaria needs a major 
overhaul, according to a report 
released on 19 September. 

The fund, based in 

Geneva, Switzerland, had 
commissioned the review after 
it discovered corruption and 
fraud affecting some US$39 
million of its grants (see Nature 
470, 6; 2011). The report says 
that the fund should shift its 
focus away from getting money 
out as quickly as possible: the 
group needs better auditing 
and management of grants, 
and should measure its 

success in terms of its impact 
on health, not how much it 
spends. The fund welcomed 
the report. See
go.nature.com/w87mnc for more.


RESEARCH
NASA's mega-rocket 


NASA revealed its latest 
designs for a heavy-launch 
vehicle on 14 September. The 
Space Launch System is set to 
be more than 10 metres taller 
than the Saturn V launcher, and 
would be the most powerful 
rocket ever to lift people into 
space, with configurations 

for both 70 and 130 tonnes 

of thrust. NASA officials say 
that they're aiming for a late- 
2017 crewless test flight, in 
advance of a 2021 manned test. 




The programme would cost 
US$3 billion a year to get to 
the test launch — less than 
the agency spent annually 
on the shuttle programme. 
See go.nature.com/3393u5 
for more. 


Body on a chip

Even though US politicians 
have not yet authorized its 
creation, a proposed new 
translational-medicine centre 
at the National Institutes 

of Health (NIH) is already 
getting busy. On 16 September, 
the NIH and the US military 
together proposed a US$140- 
million five-year effort to 
develop a chip inlaid with 
human cells for testing new 
drugs. The NIH’s share of this 
effort would be administered 
through the proposed new 
National Center for Advancing 
Translational Sciences. The 
centre is also advertising 

for a director. See
go.nature.com/9ahvkb for more.


Radio-array rivals
Australia and South Africa submitted their final bids to host the
Square Kilometre Array of radio telescopes on 15 September. Australia
is proposing to build the array in the mostly empty interior of
Western Australia, with outstations as far away as New Zealand.
South Africa would build its version in the Karoo Desert, with parts
extending into eight neighbouring countries, including islands
in the Indian Ocean. A group of external experts will scrutinize
the bids, with a decision by the project's board of directors
expected early next year. Construction of the €1.5-billion
(US$2.1-billion) array is set to begin in 2016.


TREND WATCH
Insufficient progress is being made to drive down global infant
mortality, a report from the United Nations concludes. Developed
regions are rated as ‘on track’ to meet their 2015 target of 5 deaths
in under-fives per 1,000 live births. But although death rates have
dropped rapidly in developing nations, the fall is not enough to
achieve a 32 per 1,000 target by the same date, warns the UN’s
child mortality estimation group. In 2010, developing nations had
an average of 63 deaths per 1,000, with sub-Saharan Africa on 121.


EVENTS 




Mountain quake 

A 6.9-magnitude earthquake 
struck the sparsely populated 
Himalayan state of Sikkim, 
northeastern India, on 

18 September. As Nature went 
to press, the quake had killed 
81 people, including some in 
nearby West Bengal, Bihar, 
Nepal and Tibet, and rain and 
mudslides were hampering 
rescue efforts (pictured). 


BUSINESS
Nuclear exit 


German engineering firm 
Siemens is pulling out of 
nuclear power for good. In an 
interview with Der Spiegel on 
18 September, chief executive 


Peter Löscher announced that
the Munich-based company
would no longer build or
finance nuclear power plants
in Germany or elsewhere.
Löscher said that the decision
was largely due to the nuclear
accident at Fukushima Daiichi
in Japan and the German
government's decision to shut
down its existing nuclear
plants by 2022. See
go.nature.com/zay8ra for more.


Innovation boss 


Alexander von Gabain took 
over as head of the European 
Union's European Institute of 
Innovation and Technology 
(EIT) on 15 September, 
replacing founding chairman 
Martin Schuurmans. Von 
Gabain, a microbiologist at 
the Max Perutz Laboratories 
in Vienna and a co-founder 
of the Austrian biotechnology 
company Intercell, plans to 
get the EIT more involved in 
biomedical science. 


Schön loses PhD
The University of Konstanz in Germany was correct to withdraw the
doctoral degree of disgraced physicist Jan Hendrik Schön, a court in
the state of Baden-Württemberg has ruled. Schön, formerly a staff
physicist at Bell Laboratories in Murray Hill, New Jersey — the
development arm of Lucent Technologies — is notorious for a string of
high-profile fabrications in the fields of organic and molecular
electronics. Last year, Schön successfully sued his alma mater for
its 2004 decision to revoke his 1997 PhD thesis (for which there is
no suspicion of data fabrication). But in a 14 September judgement,
the state court agreed with Konstanz that later misconduct also
showed ‘unworthiness’ to hold the doctorate. The court added that
the judgement could not be appealed.


SLOW PROGRESS ON INFANT DEATHS
The developing world is likely to miss its 2015 Millennium
Development Goal on infant mortality.


SEVEN DAYS | THIS WEEK


26-30 SEPTEMBER
The World Conference on Marine Biodiversity takes place in Aberdeen,
UK, discussing research priorities.
www.marine-biodiversity.org


27-30 SEPTEMBER
The International Council for Science holds its triennial general
assembly in Rome. The agenda includes reshaping global programmes
for environmental research.
go.nature.com/egukv6


30 SEPTEMBER
The Tevatron, Fermilab’s particle accelerator in Batavia, Illinois,
shuts down. See page 379 for more.


Global-health head 


Trevor Mundel, currently 
head of development at Swiss 
drug firm Novartis, will be the 
new president of the Global 
Health Programme at the Bill 
& Melinda Gates Foundation 
in Seattle, Washington, the 
foundation announced on 

13 September. Mundel, 
taking over from 1 December, 
replaces Tachi Yamada, who 
retired in June after five years 
as president. 


NEWS IN FOCUS


Researchers bid a fond farewell to the top US particle collider p.379
Rash of results see astronomers close in on exo-Earths p.383
Disgraced Austrian doctor gets job back p.384
A look inside the military-science complex p.386


Texas governor Rick Perry has received a stem-cell treatment deemed illegal in the United States.


Texas prepares to fight for stem cells


Enthusiasm for unapproved treatments worries regulators. 


BY DAVID CYRANOSKI 


There’s a showdown brewing in the state
of Texas — and it could get ugly. On
one side stands the US Food and Drug
Administration (FDA), which is clamping
down on the proliferation of unapproved
stem-cell treatments being offered to Americans.
On the other is state governor Rick Perry, who
is riding high in the polls as the Republican
party’s favoured candidate for the 2012
presidential elections — and a staunch advocate of
the stem-cell treatments.

At least a dozen companies in the United 
States offer the treatments, which involve 
extracting adult stem cells from a patient’s 
tissue, culturing them, then reinjecting the 
cells. The theory is that the cells will flourish 
and replace diseased or damaged tissue in a 
range of conditions from spinal-cord injury to
Alzheimer’s disease and diabetes.

But no treatment that involves anything 
more than “minimal manipulation” of adult 
stem cells outside the body has been approved 
by the FDA. Although bone-marrow trans- 
plantations, for example, involve extraction 
and reinjection of haematopoietic stem cells, 
those cells are not cultured or significantly 
processed. 

“Any procedure involving removing cells 
from the body and manipulating them — 
even if it’s something as simple as centrifuging 
them or putting them in a plastic tube — and 
then putting them elsewhere in the body 
poses risks,” says Paul Knoepfler, a stem-cell 
specialist at the Institute for Regenerative 
Cures at the University of California, Davis. 
No clinical trials have shown any evidence of 
efficacy, he says. “Patient testimonials cited by 
the people selling the treatments have little if 
any meaning.”

Depending on exactly how the cells are 
processed and administered, many of these 
procedures are illegal in the United States. But 
that didn't stop Perry from being injected with 
a concoction of his own stem cells in July to 
treat a back complaint. Perry’s procedure was 
carried out by Stanley Jones, a surgeon based 
in Houston, Texas, who specializes in cosmetic 
procedures and who is a friend of the governor.

The previous month, Perry had supported 
legislation that authorized Texas's health com- 
mission to create a stem-cell bank in which 
patients would be able to deposit their adult 
stem cells for future use. 


PUBLIC INTEREST 

Texas has poured millions of dollars into 
studying and commercializing adult stem-cell 
treatments through its Emerging Technology 
Fund, an initiative created at Perry’s behest. 
Perry sees the treatments as both a potential 
boon to the Texan economy and an alterna- 
tive to treatments that use embryonic stem 
cells, which he opposes. In a 25 July letter, he 
asked the Texas Medical Board (TMB), which 
regulates the state’s physicians and is currently 
reviewing its policy on stem-cell treatments, to 
take a lenient view on the procedures. “It is my 
hope that Texas will become the world’s leader 
in the research and use of adult stem cells,” he 
wrote. “With the right policies in place, we can 
lead the nation in advancing adult-stem-cell 
research that will treat diseases, cure cancers 
and, ultimately, save lives.”
Although scientists and physicians in Texas
are excited about the funding to develop stem- 
cell science, many are concerned that treat- 
ments will reach the clinic before safety and 
efficacy are properly established. “I do believe 
governor Perry is pushing research to the 
clinic too quickly,” says Kirstin Matthews, who
researches science and technology policy at
Rice University in Houston, and who is a member
of the TMB’s stakeholder group, which is
helping to draft the board’s stem-cell policy. 

“People should know what they are doing,”
adds Mari Robinson, executive director of the
TMB. “Otherwise they'll try to use it for
everything from getting rid of wrinkles to
curing cancer.”

The TMB’s draft stem-cell policy is open for public
consultation, and will be finalized in November.
In a series of written submissions, scientists have
pushed for more patient protection. “As a biomedical
researcher, I feel the extremes of regulatory burden
every day, but I also feel that we must protect
patients from risky treatments advanced
by overzealous, even greedy, entrepreneurs,”
wrote Bettie Sue Masters, a biochemist at the
University of Texas Health Science Center in
San Antonio.


APPROPRIATE OVERSIGHT 

Mary Ellen Weber, vice-president in charge 
of government affairs and policy at the 
University of Texas Southwestern Medical 
Center in Dallas, hopes that patients will 
be properly informed that any benefits that 
they experience may not be attributable to 
the stem-cell treatment, and may not be long- 
lived. By contrast, she warned in a letter to 
the TMB, the “risks conferred by stem-cell 
therapy may be delayed and permanent”. 
Consequently, any adult stem-cell proce- 
dures should be looked at by an institutional 
review board, she wrote. 

But state representative Rick Hardcastle, 
who introduced the legislation to author- 
ize the Texas stem-cell bank, has questioned 
the need for institutional-review-board con- 
sideration of procedures that use adult stem 
cells. “It was not, and is not, my intent to cre- 
ate onerous and unnecessary regulations to 
impede the practice and research of physicians 
in regards to the use of investigational agents,” 
he wrote to the TMB on 23 August. 

Although the TMB’s forthcoming policy is 
meant to provide clearer guidance on the use
of adult stem cells in the state, physicians and
companies are still subject to FDA regulations. 
And there are growing signs that Perry’s ambi- 
tion is on a collision course with recent efforts 
by the FDA to flex its regulatory muscle. 

For many years, stem-cell clinics have been 
able to flourish by skirting the FDA regula- 
tions. Some clinics recruit patients in the 
United States and then send them overseas for 
treatment: the Stem Cell Treatment Institute 
in San Diego, for example, treats its patients in 
Mexico. Others invoke a ‘compassionate use’ 
exemption to FDA regulations, which allows 
them to charge patients for experimental 
therapies if no other treatment options are 
available. Some argue that the FDA has no 
jurisdiction over their activities, claiming 
that adult stem cells are not drugs — merely 
the patient’s own tissue — and therefore not 
subject to FDA oversight. 

“The growth in the number of clinics and 
companies marketing stem-cell products 
without approval is explosive,” says Doug Sipp,
who studies global stem-cell regulation at the 
RIKEN Center for Developmental Biology in 
Kobe, Japan. “The United States is becoming 
one of the most rapidly expanding markets for 
unregulated stem-cell applications.” 

The FDA has long pursued a policy of trying 
to get companies to comply with the regula- 
tions, rather than prosecuting them. Recently, 
however, it has taken stronger steps to crack 
down. On 15 August, for example, the agency 
sent a warning letter to Chuck Naparalla, chief 
executive of TCA Cellular Therapy in Cov- 
ington, Louisiana, saying that the company 
had failed to meet safety standards in some 
of its five FDA-approved clinical trials of its 
stem-cell therapies. The FDA also accused it 
of selling treatments to patients “outside of 
clinical protocols”. TCA Cellular Therapy has
not responded to Nature’s repeated requests 
for information on its efforts to comply with 
the FDA’s demands. 

On 18 August, after an investigation by the 
FDA and the FBI, Fredda Branyon, former 
owner of Global Laboratories in Scottsdale, 
Arizona, was convicted by the US Attorney's 
office in the Southern District of Texas court 
of selling unauthorized stem-cell products 
across state lines. 

And in a court case that began last year, 
the FDA is demanding that Regenerative Sci- 
ences of Colorado stop selling its adult stem- 
cell product Regenexx (see Nature 466, 909; 
2010). Christopher Centeno, medical direc- 
tor of Regenerative Sciences, claims that “the 
Regenexx procedure is the practice of medi- 
cine, something Congress and the courts have 
expressly prohibited the FDA from regulat- 
ing”. The FDA argues that Regenexx falls 
under its jurisdiction because it is classed
as a biological drug under the Code of Fed- 
eral Regulations 21 on human cells and tis- 
sues. That regulation allows the reinjection of 
a patient’s adult stem cells if they have been 
“minimally manipulated”, but the FDA says
that the culturing involved in Regenexx is not 
minimal because “the cells are grown, pro- 
cessed and mixed with drug products outside 
the body”. 

While the FDA is busy in court, Texas’s 
enthusiasm for stem-cell treatment is grow- 
ing fast. 

The cells used to treat Perry were cultivated 
at a recently opened laboratory in Houston that 
is jointly run by RNL Bio, a stem-cell company 
headquartered in Seoul, and Celltex Therapeu- 
tics, a company established and run by Jones 
and David Eller, former chairman of the board 
of Texas A&M University and now a supporter 
and adviser to Perry. Neither Perry nor Jones 
responded to Nature’s interview requests. 

RNL Bio’s affiliated clinics in the United States 
take fat samples from patients and send the cells 
to Seoul for processing. The manipulated cells 
are not approved for reinjection in the United 
States or South Korea, so patients typically travel 
to China or Japan for the procedure. RNL Bio 
is now being investigated by the Korean gov- 
ernment after two people who underwent this 
procedure in Japan died (see Nature 468, 485; 
2010). Jones himself travelled to Japan last year 
to receive an injection 
of cells, prepared by 
RNL Bio, to treat his 
arthritis — success- 
fully, he claims. 

Perry’s procedure was reportedly carried out in the United 
States, at Jones's clinic. The FDA declined to discuss any 
ongoing or future investigations into stem-cell clinics, but 
a former reviewer at the FDA’s Center for Biologics 
Evaluation and Research, which regulates the clinical use 
of stem cells, says: “If Perry was treated in the United 
States, it was clearly in violation” of FDA regulations. 

Jones has said that Perry may be prepared to 
stand up to the FDA over the issue. In a letter to 
the TMB, Jones wrote: “Please don’t make this 
difficult, as Governor Perry has really gone all 
out personally to make stem cells available to 
people in need of them in Texas.” 

“He is incidentally not against a challenge 
from any government agency that wants to 
impede us in Texas,” Jones added. 

Jones is now set to treat Hardcastle, who has 
multiple sclerosis. The former FDA reviewer, 
however, says that after the publicity over Perry’s 
procedure, it would be standard practice 
for the agency to warn Jones against carrying 
out further injections. “If you do it a second 
time, you could be in hot water.” 

The showdown could be about to begin. 

“I do believe governor Perry is pushing research to the clinic too quickly.” 
Kirstin Matthews 
Fermilab faces life 
after the Tevatron 


As collider shuts down, US particle physicists shift focus. 


BY EUGENIE SAMUEL REICH 


Like an old and celebrated race track, the 
giant particle accelerator known as the 
Tevatron is down to its final laps. 

Shortly after 2 p.m. on 30 September, with 
reporters watching by video link from a nearby 
auditorium, an operator at the Fermi National 
Accelerator Laboratory (Fermilab) in Batavia, 
Illinois, will divert the final bunches of protons 
and antiprotons speeding around the Tevatron’s 
6.3-kilometre ring, sending them barrelling 
into a solid metal block. The experiment 
that ruled high-energy physics for more than 
25 years will then be over, its funding expired. 

The shutdown will provide an occasion 
to dwell on the Tevatron’s past successes (see 
‘Smashing success’), but it also marks Fermilab’s 
transition to smaller and lower-profile experiments 
that explore different kinds of physics. 
Champagne corks are not expected to pop 
following the closure. “We're not that happy,” 
says Roger Dixon, head of the lab’s accelerator 
division. “It’s a solemn occasion.” 

The closure is a consequence of tight US 
physics funding and the advent of the Large 
Hadron Collider (LHC) at CERN, Europe’s 
high-energy physics lab near Geneva, Switzerland. 
The LHC broke the Tevatron’s record 
for collision energy in 2009 and has been running 
steadily since 2010. In 2008, a US Department 
of Energy advisory panel recommended 
that the country switch its focus in accelerator 
physics from the ‘energy frontier’ that would 
be dominated by the LHC to the ‘intensity 
frontier’ in which researchers aim to increase 
the number of particles produced per second. 
Instead of creating previously unknown particles 
by pushing collisions to higher energies, 
the new strategy will be to investigate rare 
interactions involving known particles. The 
lab is already running three experiments to 
create and study neutrinos — nearly massless 
particles that interact only weakly with ordi- 
nary matter — using protons from accelera- 
tors that also feed the Tevatron. By modifying 
storage rings and accelerators that will become 
available when the Tevatron closes, Fermilab 
should be able to bring two more neutrino 
experiments online by 2014, followed by 
two experiments on muons, heavier cousins 
of the electron. The lab is also working on a 
proposal for an experiment called Project X, 
which would increase the intensity of the accel- 
erator beams from 750 kilowatts to more than 
2 megawatts, and would allow detailed com- 
parisons of the properties of matter and anti- 
matter. The project has yet to receive approval 
from the Department of Energy. 

For US-based particle physicists, the change 
is likely to mean smaller research groups — 
perhaps 100 or 200 per experiment, rather 
than 600 for each of the Tevatron’s two detec- 
tor collaborations. “It’s a big shift,” says Regina 
Rameika, project manager of one of the neu- 
trino experiments now under construction. 

As for the Tevatron, its detectors and mag- 
nets are ultimately destined for the waste dump, 
but immediate plans are to run tours of the 
soon-to-be-accessible beam cavities and detec- 
tors for some of the roughly 15,000 school chil- 
dren who visit Fermilab each year. Meanwhile, 
scientists working at the Tevatron will con- 
tinue to analyse the data that it has produced 
— looking for signals of the Higgs particle that 
is thought to endow all others with mass, or for 
evidence that will narrow the probable range of 
its mass. “We expect we will be adding to our 
legacy,” says Dmitri Denisov, spokesman for the 
Tevatron’s DZero collaboration. 
SMASHING SUCCESS 


Discoveries at the Tevatron particle 
collider (pictured, below) confirmed the 
standard model of physics and set the 
course for new investigations beyond it. 


3 JULY 1983 


Protons accelerated to a record 
512 gigaelectronvolts. 


13 OCTOBER 1985 


Proton-antiproton collisions seen for 
the first time, at 1.6 teraelectronvolts. 


3 MARCH 1995 


The top quark discovered, the last 
fundamental constituent of matter. Its 
energy signatures are pictured, bottom. 


18 NOVEMBER 1996 


Observation of antihydrogen, the first 
antimatter atoms. 


5 MARCH 1998 


The Bc meson, the last undiscovered 
quark-antiquark pair, is spotted. 


1 MARCH 2001 


Higher-energy phase of operation begins. 


25 SEPTEMBER 2006 


Discovery that Bs mesons transform into 
their own antiparticles spontaneously. 


23 OCTOBER 2006 


Discovery of the Σb baryons, which 
include a bottom quark. 


1 JUNE 2007 


Discovery of the Ξb baryon, which 
includes a bottom and a strange quark. 


4 AUGUST 2008 


Experimental evidence restricts the 
possible mass range of the Higgs boson. 
Soldiers are yet to see any effective new countermeasures against bioterror agents. 


Pentagon rethinks 
bioterror effort 


Critics say US$1.5-billion initiative has not delivered results. 


BY ERIKA CHECK HAYDEN 


In the film Contagion, it takes just a few 
months for scientists to make a vaccine 
against a deadly virus. Yet a real US military 
programme that aimed to do just that is being 
dismantled after five years of trying. 

The Transformational Medical Technologies 
(TMT) initiative, born in the US 
Department of Defense in 2006, was originally 
conceived as a five-year, US$1.5-billion 
project that would substantially accelerate the 
development of countermeasures to protect 
soldiers against biological attacks. Made into 
a permanent programme in 2009, it set out 
to sequence the genomes of potential bioterror 
agents, explore new drug technologies 
and develop ‘broad-spectrum’ therapies that 
would work against multiple bacterial and 
viral pathogens, especially haemorrhagic 
fever viruses such as Ebola and Marburg. 
Supporters of the programme point out that 
three candidate drugs developed under the 
programme, for pathogens including Ebola 
virus, are now in clinical trials. 

The TMT programme, however, has ceased 
to exist as a stand-alone effort. Alan Rudolph, 
director of Chemical and Biological Technolo- 
gies for the TMT’s parent office, the Defense 
Threat Reduction Agency, is folding some 
TMT projects into other Pentagon efforts and 
reordering their priorities. Critics say that it has 
failed in its underlying objective to provide a 
faster, game-changing approach to biodefence. 
No antibiotics developed by the TMT have 
entered clinical trials. The drug candidates it 
has developed are designed for single patho- 
gens, not multiple threats. And although the 
programme is set to award a major clinical-trial 
contract later this year, the drug being tested 
would treat not exotic, untreatable pathogens 
but ordinary influenza, a disease already heav- 
ily researched outside the Pentagon. 

Michael Osterholm of the University of Minnesota's 
Center for Infectious Disease Research 
and Policy thinks that the programme was 
overambitious and ill-conceived. “They're 
wasting tonnes of money,” he says. 

The programme’s architects have vig- 
orously defended its record. “There is a 
success there that we need to build on,” says 
Jean Reed, who, as deputy assistant to the sec- 
retary of defence for chemical and biological 
defence and chemical demilitarization, laid 
the plans for the TMT. Now a consultant to 
the National Defense University in Wash- 
ington DC, Reed adds that the programme 
has become the archetype within the defence 
department for the “development of treat- 
ments for biologically engineered and natu- 
rally occurring disease threats”. 

“The TMT from its inception was a high-risk, 
high-payoff or high-failure effort,” says 
David Hough, who became TMT programme 
manager in January 2007. He says that the 
effort has paid off: “If we get an engineered 
threat or something that we haven't seen 
before that is causing a lot of deaths, we 
think we can respond to that.” He says that 
the programme's track record is better than 
that of the Pentagon’s traditional chemical 
and biological defence research effort over 
the past decade. 

Although the TMT aimed to transform 
biodefence, it encountered many of the road- 
blocks that have hindered the nation’s bio- 
defence effort as a whole, which has spent 
$60 billion since 2001 with only modest returns 
(see Nature 477, 150-152; 2011). Developing 
broad-spectrum drugs for the battlefield has 
proved difficult because regulators are more 
accustomed to evaluating drugs that target one 
specific disease, and drug companies prefer 
to focus on diseases that affect many people 
rather than on obscure pathogens that could 
serve as bioweapons. 

These considerations helped to lead the 
TMT into focusing on influenza in 2009. That 
year, US government officials were faced with 
the double threat of H1N1 swine flu, which 
threatened to explode into a devastating pan- 
demic, and the more deadly H5N1 bird flu 
virus, which was continuing to infect small 
numbers of people. 

Government officials were “practically par- 
alysed by the fear that they were dealing with 
two strains at a time; they didn't know what 
they were going to do”, says Darrell Galloway, 
Rudolph’s predecessor at the Defense Threat 
Reduction Agency, who was a driving force 
for the TMT from its inception until he 
retired in January 2010. Galloway saw influ- 
enza as an opening to prove the programme's 
worth. In May 2009, he awarded a contract 
to AVI BioPharma of Bothell, Washington, 
to make a flu drug against the H1N1 virus, 
using its genetic sequence as a basis. Within 
months, the company had made a drug and 
tested it in ferrets. 

Yet the move angered some within the 
Pentagon and perplexed observers, because 
influenza is the focus of considerable research 
funded by the US Department of Health and 
Human Services. “I'm having a really hard time 
making a connection between the investments 
we're making and the benefit to soldiers,” said 
one staff member at the Defense Threat Reduc- 
tion Agency. 

The TMT also stumbled because companies 
attracted to biodefence tend to be small and 
inexperienced. Larger, established companies 
prefer to pursue more profitable markets, fear- 
ing that the federal government will commit 
to stockpiling only limited amounts of drugs 
developed for defence purposes. 

The company behind all three TMT drugs 
now in clinical trials, AVI BioPharma, has 
never had a drug approved by the US Food 
and Drug Administration. The company’s 
technology uses antisense, in which short 
pieces of genetic material bind to a pathogen’s 
genes and block their production. The tech- 
nology has led to few approved drugs owing 
to safety problems and a lack of efficacy. Still, 
the TMT and the Army awarded the com- 
pany a $291-million, six-year contract last 
year to fund two clinical trials, for its drugs 
against Ebola and Marburg viruses. Now AVI 
BioPharma has set its sights on a contract for 
clinical trials of its antisense drug for H1N1. 

AVI BioPharma’s chief executive, Chris 
Garabedian, says that the company’s technology 
is safer than that tested by other drug firms, and 
thus can be used in higher doses that are more 
likely to be effective than other antisense drugs 
that have failed in the past. 

THE COST OF COUNTERMEASURES 
The budget for the Transformational Medical 
Technologies programme has more than trebled 
since 2006. 

But critics say that it was a mistake for the 
TMT to invest so much in a technology that 
does not have a proven track record in infec- 
tious disease. “Everybody in that field thinks 
antisense is a failure, except the [defence 
department] programme manager,” says 
one biodefence analyst, who did not want 
to be named. 

Rudolph, who succeeded Galloway last 
September, controls the chemical and 
biological defence research budget, which 
includes standard drug- and vaccine-research 
programmes as well as the TMT. Rudolph is 
combining the TMT research money (see 
‘The cost of countermeasures’) with that for 
traditional projects, and refocusing on four 
priorities: surveillance and diagnostics, sen- 
sors, countermeasures and decontamination 
technologies. 

Rudolph has retained some TMT projects, 
such as the pathogen-sequencing studies 
led by Ian Lipkin of Columbia University in 
New York, who was a technical adviser on 
Contagion. But he has cut others, such as a 
five-year, $24.7-million contract awarded in 
2008 to Peregrine Pharmaceuticals of Tustin, 
California, to find antibodies against haem- 
orrhagic fevers. The TMT funding for AVI 
BioPharma’s two clinical trials will continue, 
however, as the trials are managed separately 
by Hough. 

Whether the dismantling of the TMT will 
improve the Pentagon's biodefence success 
rate remains to be seen, says Tom Inglesby at 
the Center for Biosecurity of the University 
of Pittsburgh Medical Center in Baltimore, 
Maryland. “In the end, the question will be, 
‘Did Rudolph make progress in the time 
he was there with the money that he had?’ 
Ultimately, he will be held accountable.” 


© 2011 Macmillan Publishers Limited. All rights reserved 


ASTRONOMY 
Hints of exo-Earths spark 
desire for a closer look 


Of the latest clutch reported, one is among the most Earth-like yet, another orbits two suns. 


BY ERIC HAND 


Geoffrey Marcy finds it strange to turn 
people away from a conference about 
planets beyond our Solar System. “This 
is a field that had three of us in 1995,” marvels 
Marcy, a pioneering exoplanet hunter at the 
University of California, Berkeley. 

But as an organizer of last week’s Extreme 
Solar Systems II conference, he had to decline 
applicants by the dozen. The Jackson Lake 
Lodge auditorium in Wyoming’s Grand Teton 
National Park was packed to its 330-person 
capacity as speakers announced a flood of new 
detections, and the air was alive with talk of a 
‘golden age’ of exoplanet astronomy. 

Along with the discoveries came some 
sobering news. Rocky, Earth-like planets may 
be less common than many hoped, and unex- 
pectedly ‘noisy’ stars are slowing the hunt. 
Moreover, astronomers cannot learn much 
beyond the basics — mass or size and orbit — 
of the planets they do find. “What we need is 
a telescope in space that can image and take 
spectra of truly Earth-like planets,” Marcy says. 
“We still need that desperately.” 

For now, however, indirect methods are 
keeping astronomers busy. One trove of dis- 
coveries came from a European team that 
watches stars for the slight wobble that signals 
the gravitational pull of an unseen planet. 
Their instrument, the High Accuracy Radial 
velocity Planet Searcher (HARPS), which is 
attached to a 3.6-metre telescope at the Euro- 
pean Southern Observatory in La Silla, Chile, 
yielded 41 new planets, including one of the 
most Earth-like yet. At 3.6 times the mass of 
Earth, it sits in the ‘habitable zone’ around its 
star, the Goldilocks range of distances at which 
a planet's surface would be neither too hot nor 
too cold for water to be liquid. 

There was also news from astronomers work- 
ing with Kepler, a NASA space telescope that 
stares fixedly at a field of about 155,000 stars 
in search of transits: the very slight dip in the 
brightness of a star as a planet crosses in front. 
The Kepler team announced that they have now 
detected 1,781 candidate planets, including 123 
that are Earth-sized (see ‘Sizing up the sample’). 
Among the objects was a novelty: a circumbi- 
nary, or a planet orbiting a pair of stars. 

Both groups are now confident enough 
to start making pronouncements about the 


The Saturn-sized planet Kepler-16b (black) orbits two dwarf stars. It is not thought to be habitable. 


statistics of planets in orbit close to a star — 
the kind that both Kepler and HARPS are most 
sensitive to. The HARPS team, for instance, 
estimates that about half of Sun-like stars have 
at least one planet with an orbital period of 100 
days or fewer, and that many of these systems 
boast several such planets. Greg Laughlin, an 
astronomer at the University of California, 
Santa Cruz, points out that the Solar System — 
in which only Mercury has such a short period 
(88 days) — might end up looking like the odd- 
ball. “The big news is that there's this staggering 
population of planets that you'd never suspect 
from looking at our own Solar System,” he says. 

Extrapolating from Kepler data, Wesley 
Traub, chief scientist of NASA’s exoplanet 
exploration programme, took a stab at a more 
fundamental question: what fraction of Sun- 
like stars have rocky, Earth-sized planets far- 
ther from the star, in a habitable zone? His 
optimistic answer: 34%. 

But many observers dismiss these cal- 
culations as premature. Marcy points out 
that the Kepler statistics suggest a drop off 
in the frequency of planets smaller than […] 
trouble detecting truly Earth-sized planets?” 
asks Marcy. “Or are they rare? We don't know.” 

The stars themselves are making the hunt 
difficult. Instruments such as HARPS detect 
the tug of a planet from tiny shifts in the frequency 
of a star's spectral lines. Many researchers 
hoped that a new tool for calibrating the 
position of those lines, called a laser frequency 
comb, would enable HARPS to detect the 
minuscule signal from a planet as small as 
Earth. But it seems that all but the most qui- 
escent stars have enough surface turbulence 
to drown out so small a signal. Similarly, the 
Kepler team has said that because of stellar 
noise, the mission will now need 8 years, rather 
than the originally planned 3.5 years, to detect 
all of the Earth-like planets around its target 
stars (see Nature 477, 142-143; 2011). 

Even after true Earth analogues have been 
detected, neither HARPS nor Kepler will be 
able to determine whether they have an Earth- 
like atmosphere containing, for instance, oxy- 
gen or carbon dioxide. For a few giant planets 
that pass in front of their stars, instruments 
such as the Spitzer Space Telescope have been 
able to analyse the spectrum of starlight shin- 
ing through the planet’s thick atmosphere, 
gleaning clues about its composition. But 
Marcy says that doing transmission spectroscopy 
on the thin ring of atmosphere 
surrounding an Earth-sized planet is 
beyond the reach of even the 6.5-metre 
James Webb Space Telescope, the orbital 
observatory that NASA hopes to launch by 
2018 if it does not fall victim to cost over- 
runs. “James Webb, yea or nay, is not the 
answer to our prayers,” Marcy says. 

As a result, researchers such as Marcy and 
Traub stress the need to go beyond the indi- 
rect techniques of HARPS and Kepler, and 
gather the faint light from the planet itself, 
which is normally invisible in the glare of 
the parent star. For that, astronomers need 
either a giant space telescope equipped with 
a device for blocking starlight, or an inter- 
ferometer, consisting of several telescopes 
flying in formation. NASA did develop a 
proposal for such a space telescope, called 
Terrestrial Planet Finder, and the European 
Space Agency hoped to fly a similar mission 
called Darwin. But budgetary constraints 
have left both missions in limbo, unlikely 
to advance to the front of either agency's 
queue until well into the next decade. At the 
conference, Traub raised the issue. “People 
are not thinking deeply about the distant 
future. People are wrapped up with what 
they're doing right now,” he says. “Clearly, 
I'm concerned.” 

But Laughlin isn’t as worried. He says that 
the enthusiasm and momentum in planet- 
hunting may lead to an unexpected solution. 
“People have a history of being inventive 
when they need to be,” he says. He points to 
French philosopher Auguste Comte, who in 
1835 wrote that astronomers might be able 
to learn about the shapes, sizes and motions 
of stars, but that stellar densities, tempera- 
tures and chemical compositions would be 
“forever denied to us”. 

Within three decades, astronomical 
spectroscopy was starting to answer all of 
those questions. 


SIZING UP THE SAMPLE 


Since February, the Kepler team has increased 
its catalogue of candidate planets by 45%. In 
that time, the number of Earth-sized candidates 
has nearly doubled, from 68 to 123. 


Earth-sized candidates (<1.25 Earth radii): 123 
MISCONDUCT 


Austria reinstates 
disgraced doctor 


Physician at heart of retracted clinical trial can return to work. 


BY ALISON ABBOTT 


He carried out clinical trials without 
ethical approval. He failed to provide 
raw data for his high-profile publications. 
He falsified legal documents. But despite 
this record, an employment commission 
has ordered that Austrian urologist Hannes 
Strasser be readmitted to his teaching post at 
the Medical University of Innsbruck. 

Now the university is trying to find a way out 
of the embarrassing situation — one that high- 
lights the weakness and tardiness of Austria's 
system for dealing with research misconduct. 
“We are being forced by a legal decision to let 
him back when we think he has no place here,” 
says the university's rector, Herbert Lochs. 

The university suspended Strasser three 
years ago after serious concerns were first raised 
about his trial of a novel stem-cell therapy for 
urinary incontinence. The therapy relied on 
injecting stem cells and fibroblasts derived 
from the patients’ own tissue into the urinary 
sphincter; it had been developed by Innovacell 
Biotechnologie in Innsbruck, a company co- 
founded by Strasser, and with which he is no 
longer involved. But many patients reported no 
improvement after the therapy, and others claim 
that it caused their bladders to seal over. 

A subsequent investigation by the Austrian 
government’s Agency for Health and Food 
Safety (AGES) found a series of ethical and 
legal misdemeanours, which led them to con- 
clude that the trial was illegal and invalid (see 
Nature 454, 922-923; 2008). The Lancet with- 
drew Strasser’s paper reporting the trial’s results 
(S. Kleinert and R. Horton Lancet 372, 
789-790; 2008). 

But on 8 September the Vienna-based 
National Disciplinary Committee — which 
adjudicates on employment issues relating to 
civil servants (including university professors) 
— revoked Strasser’s suspension. The com- 
mittee based its decision on the outcome of a 
case brought to an Innsbruck court in which 
the university hospital sued Strasser and his 
department head Georg Bartsch for €1.2 mil- 
lion (US$1.6 million) — its estimated cost for 
giving Strasser’s treatment to 400 patients not 
involved in clinical trials. The court refused 
the claim on 3 August, stating that there was 
no proof that Strasser had intended financial 
deception. Bartsch was too ill to stand trial. 
However, the court also stated that Strasser 
had provided false testimony during a 2008 
civil damages case brought by a patient who had 
received the treatment, and that he had falsi- 
fied evidence in the AGES investigation, which 
considered legal issues surrounding the clinical 
trials. It fined him €4,500. 

The university plans to appeal against 
Strasser’s reinstatement. In the meantime, the 
university has asked Strasser only to prepare 
unscheduled lectures: “We don’t want him to 
get into clinical work,” says Lochs. 

In separate legal cases, several patients who 
say the treatment harmed them are now try- 
ing to bring charges of grievous bodily harm 
against Strasser, according to the Innsbruck- 
based lawyer Thomas Juen, who has previously 
represented trial patients seeking damages. 

Earlier this year, the Medical University of 
Innsbruck completed its own investigation 
into the scientific aspects of the case, finding 
that Bartsch and Strasser had engaged in what 
Lochs views as “massive scientific misconduct”. 

The university's slowness in carrying out the 
investigation has been widely criticized. One 
academic, who asked to remain anonymous, 
said he believed that a timely, formal state- 
ment that Strasser had perpetrated serious 
scientific misconduct might have helped avert 
the disciplinary committee’s revocation of his 
dismissal. Juen adds that he is surprised that 
the disciplinary committee revoked Strasser’s 
suspension so quickly, “given that appeals are 
ongoing, and that looming cases of grievous 
bodily harm on the part of patients have not 
yet come to court”. 

Strasser did not respond to Nature’s request 
for an interview, and Bartsch was unavailable for 
comment as a result of his illness. Strasser, however, 
was quoted in a local newspaper as saying 
that he wants to return to clinical practice and 
that the hospital administration may not be able 
to stop him. 


CORRECTION 

The News story ‘Canadian ozone network 
faces axe’ (Nature 477, 257-258; 2011) 
stated that Environment Canada planned 
to cut 776 jobs. Although 776 employees 
will be affected by workforce changes, only 
about 300 posts are being eliminated. 
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> surrounding an Earth-sized planet is 
beyond the reach of even the 6.5-metre 
James Webb Space Telescope, the orbital 
observatory that NASA hopes to launch by 
2018 if it does not fall victim to cost over- 
runs. “James Webb, yea or nay, is not the 
answer to our prayers,’ Marcy says. 

Asaresult, researchers such as Marcy and 
Traub stress the need to go beyond the indi- 
rect techniques of HARPS and Kepler, and 
gather the faint light from the planet itself, 
which is normally invisible in the glare of 
the parent star. For that, astronomers need 
either a giant space telescope equipped with 
a device for blocking starlight, or an inter- 
ferometer, consisting of several telescopes 
flying in formation. NASA did develop a 
proposal for such a space telescope, called 
Terrestrial Planet Finder, and the European 
Space Agency hoped to fly a similar mission 
called Darwin. But budgetary constraints 
have left both missions in limbo, unlikely 
to advance to the front of either agency's 
queue until well into the next decade. At the 
conference, Traub raised the issue. “People 
are not thinking deeply about the distant 
future. People are wrapped up with what 
they're doing right now,’ he says. “Clearly, 
I'm concerned.” 

But Laughlin isn’t as worried. He says that 
the enthusiasm and momentum in planet- 
hunting may lead to an unexpected solution. 
“People have a history of being inventive 
when they need to be,” he says. He points to 
French philosopher Auguste Comte, who in 
1835 wrote that astronomers might be able 
to learn about the shapes, sizes and motions 
of stars, but that stellar densities, tempera- 
tures and chemical compositions would be 
“forever denied to us”. 

Within three decades, astronomical 
spectroscopy was starting to answer all of 
those questions. m 


SIZING UP THE SAMPLE 


Since February, the Kepler team has increased 
its catalogue of candidate planets by 45%. In 
that time, the number of Earth-sized candidates 
has nearly doubled, from 68 to 123. 
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MISCONDUCT 


Austria reinstates 
disgraced doctor 


Physician at heart of retracted clinical trial can return to work. 


BY ALISON ABBOTT 


e carried out clinical trials without 
H ethical approval. He failed to provide 

raw data for his high-profile publica- 
tions. He falsified legal documents. But despite 
this record, an employment commission 
has ordered that Austrian urologist Hannes 
Strasser be readmitted to his teaching post at 
the Medical University of Innsbruck. 

Now the university is trying to find a way out 
of the embarrassing situation — one that high- 
lights the weakness and tardiness of Austria's 
system for dealing with research misconduct. 
“We are being forced by a legal decision to let 
him back when we think he has no place here,” 
says the university's rector, Herbert Lochs. 

The university suspended Strasser three 
years ago after serious concerns were first raised 
about his trial of a novel stem-cell therapy for 
urinary incontinence. The therapy relied on 
injecting stem cells and fibroblasts derived 
from the patients’ own tissue into the urinary 
sphincter; it had been developed by Innovacell 
Biotechnologie in Innsbruck, a company co- 
founded by Strasser, and with which he is no 
longer involved. But many patients reported no 
improvement after the therapy, and others claim 
that it caused their bladders to seal over. 

A subsequent investigation by the Austrian 
government’s Agency for Health and Food 
Safety (AGES) found a series of ethical and 
legal misdemeanours, which led them to con- 
clude that the trial was illegal and invalid (see 
Nature 454, 922-923; 2008). The Lancet with- 
drew Strasser’s paper reporting the trial’s results 
(S. Kleinert and R. Horton Lancet 372, 
789-790; 2008). 

But on 8 September the Vienna-based 
National Disciplinary Committee — which 
adjudicates on employment issues relating to 
civil servants (including university professors) 
— revoked Strasser’s suspension. The com- 
mittee based its decision on the outcome of a 
case brought to an Innsbruck court in which 
the university hospital sued Strasser and his 
department head Georg Bartsch for €1.2 mil- 
lion (US$1.6 million) — its estimated cost for 
giving Strasser’s treatment to 400 patients not 
involved in clinical trials. The court refused 
the claim on 3 August, stating that there was 
no proof that Strasser had intended financial 
deception. Bartsch was too ill to stand trial.
However, the court also stated that Strasser
had provided false testimony during a 2008 
civil damages case brought by a patient who had 
received the treatment, and that he had falsi- 
fied evidence in the AGES investigation, which 
considered legal issues surrounding the clinical 
trials. It fined him €4,500. 

The university plans to appeal against 
Strasser’s reinstatement. In the meantime, the 
university has asked Strasser only to prepare 
unscheduled lectures: “We don’t want him to 
get into clinical work,” says Lochs. 

In separate legal cases, several patients who 
say the treatment harmed them are now try- 
ing to bring charges of grievous bodily harm 
against Strasser, according to the Innsbruck- 
based lawyer Thomas Juen, who has previously 
represented trial patients seeking damages. 

Earlier this year, the Medical University of 
Innsbruck completed its own investigation 
into the scientific aspects of the case, finding 
that Bartsch and Strasser had engaged in what 
Lochs views as “massive scientific misconduct”. 

The university's slowness in carrying out the 
investigation has been widely criticized. One 
academic, who asked to remain anonymous, 
said he believed that a timely, formal state- 
ment that Strasser had perpetrated serious 
scientific misconduct might have helped avert 
the disciplinary committee’s revocation of his 
dismissal. Juen adds that he is surprised that 
the disciplinary committee revoked Strasser’s 
suspension so quickly, “given that appeals are 
ongoing, and that looming cases of grievous 
bodily harm on the part of patients have not 
yet come to court”. 

Strasser did not respond to Nature’s request 
for an interview, and Bartsch was unavailable for 
comment as a result of his illness. Strasser, however,
was quoted in a local newspaper as saying
that he wants to return to clinical practice and 
that the hospital administration may not be able 
to stop him.


CORRECTION 

The News story ‘Canadian ozone network 
faces axe’ (Nature 477, 257-258; 2011) 
stated that Environment Canada planned 
to cut 776 jobs. Although 776 employees 
will be affected by workforce changes, only 
about 300 posts are being eliminated. 


THE CHANGING FACE OF MILITARY SCIENCE


Basic research funded by the Pentagon 
is facing an uncertain future. 


BY SHARON WEINBERGER 


In 2005, as roadside bomb attacks were claiming ever more lives in
Iraq, senior Pentagon officials called on the academic community
to join a ‘Manhattan Project’ to counter these improvised explosive
devices. By invoking the Second World War race to build the atomic
bomb, military leaders seemed to be pushing for a massive investment
in science that could, like the first nuclear weapon, turn the tide of war.

Academics responded with a collective shrug. The Pentagon's grand
rhetoric wasn’t matched with any great influx of funding for science, and
it wasn't clear how any one technology could help fight a loosely
organized, deliberately low-tech enemy. Besides, says Julia Erdley, deputy
science adviser to the Pentagon’s Joint Improvised Explosive Device Defeat
Organization, “we are looking for near-term solutions”.

Six years later, the Department of Defense (DOD) has spent more
than US$17 billion on countering improvised explosive devices, but, as
Erdley suggests, the vast majority of that money has gone on
implementing known solutions such as stronger armour for vehicles and
personnel, not advanced research. Roadside bombs remain the single biggest
killer of US and allied troops in Iraq and Afghanistan. Pentagon
officials now admit that there is no technological ‘silver bullet’ for
preventing, detecting and disarming roadside bombs, and their Manhattan
project rhetoric has long since been replaced with more sober talk of
disrupting highly distributed terrorist networks (see Nature
471, 566-568; 2011).

The failure to mobilize the scientific community for the war on terror
stands in stark contrast to what happened in the cold war, when
Pentagon-supported science boomed, and was viewed as a crucial asset to
counter Soviet technological prowess. Today’s military has to operate in
much more ambiguous and complex environments, in which ‘soft’ skills such
as trust-building, intelligence-gathering and cultural insight may prove
as decisive as any technological advantage. Given this new military
reality, it is becoming less clear what science and technology research
has to offer.

As the key to winning wars shifts from atomic bombs, stealth bombers and
satellites to community work, the value of science to the military is
diminishing.

BROKEN PROGRAMMES

That uncertainty may help to explain what some now see as a lack of
sustained Pentagon support for blue-sky basic science and a preference for
applied research with a short-term pay-off. “We believe that important
aspects of the DoD basic research programs are ‘broken’ to an extent that
neither throwing more money at these problems nor simple changes in
procedures and definitions will fix them,” wrote the JASONs, a defence
advisory group made up of independent scientists, in the most recent
publicly available assessment of Pentagon science and technology. (Completed
in 2009, the JASON report was released to the public in May 2010.)
On the surface, the Pentagon’s science base looks healthy enough,
and it supports a vast array of research (see page 369). The science and
technology budget, which consists of basic research, applied research
and advanced technology development — budget categories 6.1, 6.2 and
6.3 in Pentagon parlance — has fallen from its post-11 September 2001
peak in 2005, when it reached some $14.7 billion per year (see ‘Rise and
fall’). But most of that decline came in the advanced-technology
category, not basic research. And the total still stands at about
$12 billion a year, nearly twice the $6.8 billion budget of the US
National Science Foundation, and much higher than defence science
expenditures in Europe, where countries have traditionally spent only a
fraction of what the United States spends on the military. In 2009, the
most recent year for which figures are available, the members of the
European Defence Agency — every country in the European Union except
Denmark — spent an aggregate of only €2.26 billion (US$3.1 billion) in
the ‘research and technology’ category, the vast majority of which goes
to the development of advanced aircraft and other weaponry, not science.

[Figure: ‘Rise and fall’. Pentagon science and technology spending in 2012 US$ billions, by year from 1962. Pentagon spending on science and technology fell after the end of the cold war, then surged again during the war on terrorism. But the current drop could go deeper and last longer.]

Pentagon research also had a champion in former US defence secre- 
tary Robert Gates, a one-time CIA director who had been president of 
Texas A&M University in College Station before he came to the DOD 
in 2006. 

For example, Gates was well aware that in many academic fields, nota- 
bly the social sciences, relations with the military have been fraught 
and often hostile since the Vietnam War (1955-75). In 2008, hoping to 
rebuild those ties, Gates proposed Minerva: a basic-science programme 
that would specifically focus on the social sciences. Gates saw Minerva 
as emblematic of military science’s changing mission. “The challenges 
facing the world require a much broader conception and application of 
national power than just military prowess,” he said in announcing the 
programme. “The government and the Department of Defense need to 
engage additional intellectual disciplines — such as history, anthropology,
sociology and evolutionary psychology.”


MAGNIFICENT SEVEN 
Beginning with the president's fiscal year 2012 budget request this past 
February, Gates set a target of 2% annual growth in the basic-science 
budget over the coming years, and pledged to hold the applied- and 
advanced-technology accounts steady. That was particularly heartening 
news for those disciplines to which defence funding is crucial. About 
one-third of all the funding for oceanography research and computer 
science in the United States comes from the Pentagon, for example, as 
does a majority of the funding for mechanical engineering (see Nature 
466, 656-657; 2010). “Physics research is no longer tied so exclusively 
to military funding,” says David Kaiser, a historian of science at the 
Massachusetts Institute of Technology in Cambridge, “although it still has 
a large role.” And the defence department is also now the largest single 
source of funding for research into traumatic brain injury (see page 390). 
Shortly before stepping down on 30 June this year, Gates signed off a 
new science and technology plan for the Pentagon. The policy includes a 
list of priorities — which Pentagon insiders immediately dubbed the ‘mag- 
nificent seven’ — to be used for budget planning over the next five years. 
And yet, Gates’s efforts also illustrate some of the many strains in the 
Pentagon's science and technology programme. Minerva, in particular, 
has met with decidedly mixed reactions, as academics question whether
the Pentagon has any business setting the course of social-science research
(see Nature 455, 583-585; 2008). And the magnificent-seven list, which 
includes topics such as research to counter weapons of mass destruction, 
and engineering resilient systems, arguably hasn't done much to inspire 
the scientific community. “It would be hard to categorize it as bold or 
prescient,” says Mark Lewis, an aerospace engineer at the University of 
Maryland in College Park and a former chief scientist of the US Air Force.
It is essentially a compendium of the individual military services’ wish 
lists. A Pentagon spokesperson also 
says that there are no funding goals 
tied to the magnificent seven. 

A more fundamental issue is 
what many observers see as a lack of 
high-level vision and coordination 
for Pentagon research. In earlier 
decades, that coordination was car- 
ried out by the director for defence 
research and engineering (DDR&E), 
a position established in 1958 after 
the Soviet launch of the Sputnik 
satellite. Located in the Pentagon's 
power centre — the Office of the 
Secretary of Defense — this director 
oversaw all of the department's sci- 
ence and technology programmes. 

But in the late 1970s, the position ceded much of its authority over 
budget and policy to the under secretary for acquisition — the chief 
weapons buyer. The DDR&E, recently renamed the assistant secretary 
of defence for research and engineering, was left with a limited staff, 
overseeing a vast portfolio of science accounts at the individual services 
and the Pentagon-wide Defense Advanced Research Projects Agency. In 
recent years, the office has become marginalized, with its staff fending 
off spending cuts in the science budget, rather than being a driving force 
in military science policy. 

And even when the office succeeds in defending basic research, 
according to the 2009 JASON report, the research inexorably gets pushed 
towards immediate applications. In a sample of 258 basic-research pro- 
jects funded by the Air Force Office of Scientific Research in 2007, and 
a similar sample funded by the Army Research Office, the group found 
that as many as 81% “are not, even by a generous stretch, 6.1 research”. 

The JASONs urged the defence department to elevate and strengthen
the DDR&E office, and make it independent of weapons acquisition. 
But the defence-department bureaucracy has given no sign that any 
such change is in the offing. 

In the meantime, the Pentagon faces a more urgent threat. “We're 
starting to see a downward trend in R&D funding,” says Todd Har- 
rison, a fellow at the Center for Strategic and Budgetary Assessments 
in Washington DC. The Obama administration has already asked the 
Pentagon to cut $400 billion from its budgets over the next 12 years 
— the current budget is about $700 billion per year — and there's no 
guarantee that those cuts won't be expanded as Congress struggles to 
trim the US federal deficit, or that the money won't come from the sci- 
ence and technology budget. 

Lewis sees Gates’s commitment to increasing basic-science spending 
as one of the most important changes in the Pentagon's science policy 
over the past few years. The question now, however, is whether Gates’s 
successor, former CIA director Leon Panetta, will uphold that commit- 
ment. Lewis points to the new defence secretary’s confirmation hearings 
on Capitol Hill, when he was specifically asked that question. Panetta 
replied that he valued basic research — but that “all defence appropria- 
tions must be considered during this time of budget constraints”. 

In other words, everything is on the table for cuts, including science. 
“That would be a profound change,” says Lewis.
SEE EDITORIAL P.369


SHARED INTELLIGENCE


The military has a vast array of 
scientifically valuable data — some 
more accessible than you think. 


BY GEOFF BRUMFIEL 


No one monitors our planet more closely than the military.
Thirty-six thousand kilometres above Earth, US Air Force satellites
watch for the heat plume of a ballistic missile. An array of
other surveillance satellites patrol lower altitudes. Some can see a rifle
from space; others penetrate cloud cover with radar, seeking military
hardware or installations. Still closer in, aircraft and drones fly over
conflict zones collecting intelligence, and seismometers listen for shudders
from an underground nuclear test. Even the deepest oceans are prowled
by military submarines, watching their foreign adversaries.

Through most of their history, the data collected by this vast blanket
of military sensors have been highly classified. But on occasions when
scientists are lucky enough to see the data, their view is considerably
different from that of the generals. Satellites designed to track missiles
can also spot the flaming trails of meteors; aerial photographs of Iraq
have allowed archaeologists to trace ancient canals. Even the military's
most banal weather satellites collect data on ocean precipitation that are
valuable for understanding Earth's energy cycles.

After the cold war, some of these data did start trickling out to
scientists, mainly in the United States, which has vast military resources
and a vibrant scientific community. The flow ebbed after 2000 — but there
are hints that it is resuming, and that more fruitful data collaborations
are to come. A group of security-cleared scientists called MEDEA
has recently rekindled ties with the US intelligence community to
discuss the use of military environmental data for the study of climate
change. And an agreement set to be finalized in October between
NASA and the US Air Force will give astronomers unprecedented
access to data on meteors entering the atmosphere. Some details of
those data must be obfuscated to preserve state secrets, but researchers
say that the trove nonetheless has enormous scientific potential. “I think
it’s become more useful now than it ever has been before,” says John
Orcutt, an oceanographer at the University of California, San Diego, and
a member of MEDEA.

In the United States, the start of the Manhattan Project in 1942 set the
tone for collaboration between the modern military and civilian
scientists. The greatest physicists of the era, conscripted to build the
atomic bomb, spent years working closely with the US Army. The Pentagon
has used outside scientists to help shape its capabilities ever since. It
maintains a handful of quasi-academic labs near university campuses,
and a truculent panel of independent scientists — known as the JASONs
— advises it on technical topics such as submarine detection and nuclear
weapons (see page 397).

At the same time, opportunistic collaborations have sprung up
between civilian scientists and the defence establishment. With the
advent of nuclear submarine warfare in the 1950s, the US Navy devoted
enormous resources to mapping and understanding the sea floor —
including mid-ocean ridges, where Navy mapping yielded clues to the
theory of plate tectonics, according to Raymond Jeanloz, an Earth
scientist at the University of California, Berkeley, and a long-time member
of the JASONs. Seismic networks used to monitor nuclear tests have also
mapped earthquakes. Jeffrey Richelson, a historian at the National
Security Archive in Washington DC, says that since the 1970s, the US defence
department has occasionally shared satellite imagery with civilian
agencies in response to natural disasters such as flooding and forest fires.

But the military’s most sensitive data remained off-limits to
academics. In 1967, for example, early-warning radar in Alaska spotted pulsars
— rotating stars that emit a pulsing radio signal — months before any
civilian astronomers did. The staff sergeant who made the observations
kept quiet about his discovery for 40 years, until the sightings were
declassified in 2007 (ref. 1).

After the end of the cold war, restrictions began to loosen. In the
mid-1990s, astronomers struck up an ad hoc arrangement with Air Force
Space Command in which they could ask for data on
specific meteors that had been collected by missile-warning satellites.
At around the same time, Al Gore, then a Democratic senator from 
Tennessee, began to ask what the intelligence community could offer 
climate scientists. Gore was interested in environmental issues and 
had also served on intelligence and military committees in Congress. 
He wrote to Robert Gates, then the director of the Central Intelligence 
Agency, prompting Gates to invite a group of scientists to gain security 
clearance and take a look at what the military had to offer. After Gore 
took office as Bill Clinton’s vice-president in 1993, the group solidified 
under the name MEDEA — Measurements of Earth Data for Environ- 
mental Analysis. 

“With the proper justification, I could ask for almost anything,” says 
William Schlesinger, a MEDEA member and president of the Cary Insti- 
tute of Ecosystem Studies in Millbrook, New York. Schlesinger used 
reconnaissance imagery going back to the Second World War to search 
for climate change’s influence on desertification of the Sahara (he didn't
find any; ref. 2).


TRADE SECRETS 

MEDEA did succeed in getting intelligence satellites to systematically 
photograph locations of environmental interest in the Arctic, Antarctic 
and the continental United States. In 1995, the group also successfully 
lobbied for the release of images from early photo-reconnaissance satel- 
lites Corona, Argon and Lanyard, which took more than 860,000 photo- 
graphs of Earth between 1960 and 1972, recorded on rolls of film. Since 
then, an entire cottage industry has sprung up involving archaeologists 
who search for roads and other ancient features in the photos, many 
of which show tracts of land that have since been consumed by urban 
sprawl. Jason Ur, an archaeologist at Harvard University in Cambridge, 
Massachusetts, for example, has used them to map massive canals dug 
by ancient Assyrian kings (ref. 3; see ‘Spying on an ancient city’).

In the late 1990s, work by Gore and MEDEA led the United States and 
Russia to declassify Arctic-sea-ice data recorded between the 1970s and 
1990s by satellites, submarines and other sources. Scientists have since 
been able to use those data to reconstruct the gradual thinning of Arctic 
ice in the decades before civilian monitoring began. “Without the early 
classified data, people wouldn't have a clue,” says Ralph Cicerone, the 
president of the US National Academy of Sciences. 

Then, around 2000, MEDEA abruptly halted its work and, in 2009, 
the informal meteor data from the Air Force stopped flowing too. No 
one really knows why. But such twists and turns are the price of working 
with the intelligence community. As Schlesinger puts it, researchers aren't 
privy to the “darkened world where a bunch of people make a decision”. 




Sharing will never be a priority for those charged with defending the 
United States, says Steven Aftergood, who heads the Project on Govern- 
ment Secrecy at the Federation of American Scientists in Washington 
DC and has spent decades tracking the US intelligence agencies. Even 
if information is unclassified, agencies may not want to dole it out freely
— or devote resources to converting it into formats that scientists can 
use. “No organization spontaneously discloses and shares its information;
that’s just a bureaucratic law of physics,” Aftergood says. Political
pressure, such as that applied by Gore, is key to persuading intelligence 
agencies to share data, he says. 

These days, new collaborations are emerging. In 2008, congres- 
sional committees concerned about climate change quietly reconvened 
MEDEA to examine whether military- and intelligence-community 
assets could supply environmental data. The answer was yes, according 
to Cicerone, who has served as informal chair of MEDEA since 2008. 
Although intelligence satellites aren't as useful as custom-built
instruments, the panel concluded that they could fill some gaps in climate data
gathered by civilian satellites, particularly given recent budget
shortfalls and launch failures such as the loss of the NASA Orbiting Carbon
Observatory in February 2009 (ref. 4).

SPYING ON AN ANCIENT CITY
This image, acquired on 28 February 1966
by the Corona reconnaissance satellite,
reveals details of a capital built by Assyrian
king Ashurnasirpal (883-859 BC) on the
Tigris River in what is now northern Iraq.

Also in 2009, MEDEA persuaded intelligence officials to publicly share 
images of areas of environmental interest that had, by that time, been 
photographed regularly for more than a decade. The images are now 
archived as the Global Fiducials Library, available through the US Geo- 
logical Survey (USGS). Orcutt says they are “relatively priceless at this 
point” because they are gathered roughly once every few weeks — more 
frequently and continuously than those from civilian research satellites. 

Lindley Johnson, who oversees NASA’s Near-Earth Object Observation
programme, believes that the space policy unveiled in 2010 by US Presi- 
dent Barack Obama, which explicitly endorses data sharing, may have 
smoothed his efforts to secure data from the US Air Force. Johnson says 
the new arrangement, which will give astronomers access to data from 
missile-warning satellites on all meteors — not just the ones researchers 
knew about already — will allow scientists to gain a better understanding 
of the range of near-Earth objects in orbit. 

How much science will emerge from these burgeoning relationships 
remains to be seen. So far, the newly available image 
libraries of the Arctic and Antarctic have seen only mod- 
est use from scientists. “One of our biggest challenges is 
to educate the science community about the existence of 
our programme,” says Bruce Molnia, executive director
of the Civil Applications Committee at the USGS in Res- 
ton, Virginia, which oversees civilian use of classified image data. And 
the members of MEDEA, who have access to the full array of classified 
data, are, for now at least, using it to address policy questions raised by 
government agencies — such as what national security risks are posed 
by climate change — rather than conducting fundamental research of 
their own choosing. 

Yet Cicerone is hopeful that even more of the intelligence data being 
collected can eventually be shared. It is now feasible to save almost 
everything that the military’s eyes and ears are recording about Earth. “As 
scientists, we don't want observations to be thrown away,” he says. “With
the Earth, as time passes, you just get one shot at it.”
SEE EDITORIAL P.369


Geoff Brumfiel is a senior reporter for Nature in London.

The high-pressure shock waves generated by the detonation of roadside
bombs can cause invisible damage to the brain. 


Wartime explosions may
be creating an epidemic of
brain damage — and a major
challenge for scientists.


BY SHARON WEINBERGER

To Burt, the blasts he experienced
in Afghanistan
eventually became a kind 
of music. The detonation of C4 
and other such military-grade 
explosives felt like extremely high 
notes — painful, yet over quickly. 
But blasts from bombs made 
out of fertilizer — a favourite of 
Afghan insurgents — were like 
standing next to a speaker at a 
rock concert: the dull bass thuds 
didn't necessarily hurt, but they 
would reverberate through his 
body like a wave, and stay with 
him for a long time afterwards.

They’re with him still. Burt, 
who asks that his real name not 
be used, spent four months as a 
tactical adviser to a US military 
bomb-disposal unit in Afghani- 
stan, during which he was within 
50 metres of a detonating impro- 
vised explosive device (IED) more 
than 18 times. His sleeping prob- 
lems began even before he left. So 
did the headaches, the ringing in 
his ears and the nausea. He started 
to forget things — a problem that 
got even worse after he returned 
home. Burt would find himself 
in a room in his house and won- 
der why he was there. One time, 
he told his wife they should try 
a new restaurant in town. She 
replied that they had eaten there 
with friends just a few days before. 

As recently as two years ago, 
this constellation of symptoms 
might have been diagnosed as 
a classic case of post-traumatic 
stress disorder (PTSD), a psy- 
chological condition that can be 
caused by the constant stress of 
being in combat. But Burt, now 
on medical leave, blames those 
low notes. He is convinced that 
the body-shaking blasts did 
something to his brain. And many 
doctors, medical researchers and 
military officials have come to 
believe he is right. 

The visible toll of insurgent- 
made IEDs has been awful 
enough. In the ten years since 
military operations began in 
Afghanistan and then Iraq, IEDs 
have killed more than 3,000 US 
and allied troops, and wounded 
roughly ten times that number. 
But many more troops have been
exposed to multiple blasts and
not suffered any visible physi- 
cal injuries. Like Burt, they often 
report an array of symptoms, 
ranging from sleep disturbance 
to problems concentrating. And 
an increasing body of evidence 
suggests that the repeated con- 
cussions have left them with an 
invisible, subcellular-level form of 
traumatic brain injury (TBI) that 
not only impairs their day-to-day 
functioning, but also increases 
their long-term risk of developing 
neurodegenerative diseases. 
“We've got a lot of guys out there that might be 30 years old that have
been blown up a dozen times,” says Kevin Kit Parker, a biomedical
engineer at Harvard University in Cambridge, Massachusetts, who is
conducting research on TBI. “And the risk that these guys are going
to get a disease like Alzheimer’s or Parkinson’s is soaring.”

The number of troops affected by this kind of silent TBI has
already topped 200,000, according to the Defense and Veterans Brain
Injury Center in Washington DC. A survey done by the Rand Corporation,
a not-for-profit research firm in Santa Monica, California,
suggests it could be as high as 320,000. The Pentagon and the US
Department of Veterans Affairs, which are responsible for the
health care of current and former troops, respectively, are getting
worried about a potential epidemic of disability and dementia.
The disorder also presents a major challenge for researchers.

No one fully understands what the blast waves are doing to the
brain, explains Walter Koroshetz, deputy director of the US National
Institute of Neurological Disorders and Stroke in Bethesda, Maryland.
Thanks to mounting evidence from professional sports, he says,
“it’s been known for a long time that repetitive head injuries lead
to chronic degenerative disease. But no one has really got a hold on
how that happens.” Worse, he says, coming up with an effective
treatment, and not just alleviating symptoms, could take years: some
20 compounds and interventions have been tested in more than 50
trials in the past 30 years. “People just look at this field and turn
around and run,” Koroshetz says.


PLAYING CATCH-UP
The good news is that the Pentagon has finally begun to put a high
priority on understanding, diagnosing and treating these injuries.
But, as officials there now admit, it is playing catch-up after too many
years of ignoring the problem.
“The system of care was really in denial for the longest time,” says
Colonel Christian Macedonia, a physician with the US Army who
serves as medical-sciences adviser to Admiral Michael Mullen,
chairman of the Joint Chiefs of Staff. Partly this was just the culture
of the military, says Macedonia: because most soldiers dazed by a
blast wave seemed to recover very quickly — on the surface — the
attitude was, “Hey, shake it off”.

When the symptoms did begin 
to show, he says, troops with TBI 
were often misdiagnosed as hav- 
ing PTSD, which has similar 
symptoms. And veterans of Iraq 
and Afghanistan have all too often 
been exposed to physical and 
psychological traumas that could 
easily cause both. 

But most of all, Macedonia 
thinks that the reluctance to rec- 
ognize silent TBI was “the ghost 
of the Gulf War” — the ongoing 
scientific controversy around the 
diffuse symptoms described by 
many troops who served in the 
1991 conflict. Study after study 
has failed to identify a root cause 
for Gulf War syndrome, he says, so 
when people started coming for- 
ward with TBI — yet another con- 
stellation of complaints that could 
not be linked to a single cause — 
the frustrated military-medicine 
hierarchy just didn’t want to hear 
about it. 

That attitude didn’t begin to 
shift until senior military lead- 
ers began to sense a dissonance 
between the official reports they 
were being given and what they 
saw when visiting injured troops.
One crucial moment came in 2009
when Marine Corps comman- 
dant General James Amos toured 
Walter Reed Hospital in Bethesda, 
Maryland, and was introduced to 
a patient who said, with consider- 
able effort, “General, I know who 
you are. I have a picture of you and
I together in Iraq.” 
It turned out that Amos had 
a copy of the picture, too. It had 
been taken just two years earlier, 
when he had posed with a group 
of marines who had just survived 
an IED that had detonated directly 
under their vehicle. Thanks to the 
vehicle’s advanced armour, all 
of them seemed unscathed. But 
this young man, a bomb-disposal 
expert, went straight back to 
work and was quickly exposed to 
several more blasts. His physical 
condition deteriorated rapidly, 
his life began to unravel and — 
after some difficulty getting the 
military medical establishment to 
recognize his TBI — he had been 
admitted to Walter Reed with 
severe neurological problems. 
Amos describes the meeting as 
a seminal moment for him. “This 
TBI business is real, and we've got 
to get past the point of ignoring it,”
he recalls of his reaction. “We need 
to do something about it.” 
Mullen was coming to much 
the same conclusion. Concerned 
that he wasn’t getting a full picture 
of the brain-injury problem, he 
asked Macedonia to help organ- 
ize a ‘Gray Team’ of researchers 
and medical professionals with 
combat experience to look at the 
realities of TBI on the battlefield. 
The Gray Team (named after 
the brain's grey matter) made its 
first visit to Afghanistan in 2009, 
says Macedonia, and quickly 
concluded that Mullen’s suspi- 
cions were well founded. Official 
reports had claimed that more 
than 90% of troops with concus- 
sion were being assessed with the 
13-point Military Acute Concus- 
sion Evaluation (MACE). But 
when the Gray Team travelled to 
Afghanistan, the group found that 
the vast majority of medical pro- 
fessionals — in both large military 
hospitals and remote outposts — 
didn’t even know what a MACE 
was. “Doctors couldn’t tell you the
first thing about it, even though 
they had all the training materi- 
als,” says Macedonia. No one was 
enforcing the screening.
In parallel with the efforts
of the Gray Team, the Defense 
Advanced Research Projects 
Agency (DARPA) and the Office 
of Naval Research were sponsor- 
ing a study that for the first time 
sought to understand how the 
brain is affected by blast waves, 
which may cause different inju- 
ries from the blunt-force trauma 
seen in sports injuries. The study 
focused on breachers: marines 
who specialize in using explosives 
to enter buildings. The first paper 
is only now going through review, 
but researchers say that they have 
found evidence of neurological 
impairment in the instructors, 
who have had long-term, repeated 
exposure to low-level blasts. 

On 21 June 2010, guided in part 
by the breacher study, the Penta- 
gon announced its first policies 
for identifying and treating peo- 
ple who may have TBI. Included 
were the first military-wide 
mandatory triggers for screening 
troops, including a rule that any- 
one within 50 metres of a blast had 
to be evaluated for signs of brain 
injury. 


THE RESEARCH SCRAMBLE 

The Pentagon has also started to make up for its long neglect of brain-injury research. The Department of Defense's Congressionally Directed Medical Research Programs, one of the major conduits for medical-research funding, provided no money specifically for TBI or PTSD between 1999 and 2005. In fiscal year 2006, a small amount, US$3.7 million, went to PTSD, but TBI was not even listed as a research topic. In 2007, however, mounting reports of battlefield brain injuries persuaded Congress to allocate $150 million for TBI research, with another $150 million for PTSD research.

That influx of money was enough to open the door to people such as Parker, one of the few medical researchers working on TBI who has combat experience. His research focus had been on cardiac cell mechanics. But in 2002, he served the first of his two tours of duty as an infantry officer in Afghanistan and began to see the effects of TBI on his fellow soldiers. The bombs then were still relatively small and unsophisticated: artillery shells hooked up to garage-door openers, for instance. But by the time of his second tour in 2009, troops were encountering 200-kilogram fertilizer bombs that could blow unarmoured vehicles to smithereens. As he puts it, only half jokingly, once people started trying to kill him with IEDs, “I figured I had better turn into some kind of neuroscientist”.

[Figure: TRAUMA IN THE BRAIN. Brains affected by blunt force and blast waves can show few outward signs of injury. But a microscope reveals neurological abnormalities similar to those found in Alzheimer’s disease.]

In fact, Parker’s first formal
involvement with brain-injury 
research began when he attended 
a DARPA workshop on the subject 
in 2005. There he learned that one 
of the challenges was to under- 
stand the effects of an explosive 
blast on the brain. With his back- 
ground in cell mechanics, Parker 
immediately began to wonder 
about integrins, receptors that 
mediate the cell’s attachment to 
surrounding tissue. Could a blast 
wave damage them enough to disrupt
the proteins’ functioning?

The idea got a cool reception 
at first, says Parker, who is now a
member of the Gray Team. “The
community that does neuroscience
and understands cell
mechanics is non-existent,” he
says. “It’s like if you’re used to 
reading English and I hand you 
a paper in Mandarin Chinese: 
it’s going to be kind of difficult.” 
But a grant from DARPA allowed 
Parker and his group to develop 
an in vitro model to test his idea. 
And in July, his team published 
a paper showing that the idea is 
essentially correct: blast-induced 
brain injury sets off a cellular 
chain reaction that disrupts inte- 
grin signalling, impairing connec- 
tions among the brain’s neurons 
(M. A. Hemphill et al. PLoS ONE 
6, e22899; 2011).

The increased funding has also 
led to progress towards a blood 
test for diagnosing silent TBI. Cur- 
rently, clinicians can only infer the 
presence of such brain damage by 
cognitive-impairment tests. This 
means that, because the symp- 
toms overlap with those of other 
disorders, brain-injury research- 
ers can't always be sure about 
what they’re measuring — and 
patients might not be receiving 
the most appropriate care. Now, 
after looking at a variety of pro- 
teins that seem to become elevated 
in the bloodstream after a brain 
injury, army-funded researchers 
tested two that seemed especially 
promising in small-scale, phase II 
clinical trials. Known as ubiquitin 
C-terminal hydrolase (UCH-L1) 
and glial fibrillary acidic protein 
(GFAP), they will soon be tested 
in large-scale, phase III trials. 

Working independently of 
the Pentagon, Bennet Omalu, 
a forensic pathologist at the 
University of California, Davis, 
and the chief medical exam- 
iner for San Joaquin County in 
California, has started to look 
at veterans’ brains for chronic 
traumatic encephalopathy. First 
identified in professional ath- 
letes involved in contact sports, 
this neurodegenerative disorder 
is believed to be caused by mul- 
tiple concussions. In November, 
Omalu expects to publish what 
may be the first case study dem- 
onstrating chronic traumatic 
encephalopathy in a military 
veteran with silent TBI. 

The young man had been exposed to multiple blasts during two deployments to Iraq, explains Omalu, who in 2005 published the first evidence of chronic traumatic encephalopathy, which he had identified from autopsy samples from an American football player (B. I. Omalu et al. Neurosurgery 57, 128-134; 2005). After returning home, the man began to experience memory problems, mood disorders and self-control problems. Then, aged 27, he committed suicide. With the permission of his relatives, says Omalu, “I got his brain, examined it, and lo and behold, he had CTE changes”: abnormal accumulations of the tau protein associated with Alzheimer’s disease and other dementias (see ‘Trauma in the brain’).

[Photo caption: The US military is experimenting with the use of electroencephalography during the baseline pre-deployment testing of its troops.]


LIMITED ACCESS 

Few medical researchers work- 
ing on brain injuries have an 
easy way to collaborate with the 
Pentagon. Its unique combina- 
tion of bureaucracy and national- 
security considerations prevents 
access to many data and brain- 
tissue samples that could be use- 
ful for medical researchers. For 
example, access to the Pentagon's 
Joint Theater Trauma Registry
(a compilation of all military
trauma-related data) is highly
restricted, lest enemies use the 
information to improve their abil- 
ity to injure US soldiers. “Giving 


the NIH access is not impossible, 
but it is very, very difficult,” says
Major General James Gilman, 
who heads the US Army Medical 
Research and Materiel Command 
at Fort Detrick, Maryland. 

There have been some signs of 
change. A joint programme by the 
US National Institutes of Health 
(NIH) and the Uniformed Ser- 
vices University of the Health Sci- 
ences, both in Bethesda, recently 
hired a neuropathologist spe- 
cifically to look at brain tissue of 
deceased troops, although access 
to the tissue is not yet guaran- 
teed. Also, the Pentagon and the 
NIH agreed in August to develop 
a database for TBI that is similar 
to the ones created for Alzhei- 
mer’s disease, autism and cancer 
research. The idea is to standard- 
ize data collection across studies 
so that researchers can compare 
results more easily. 

Among other things, such com- 
parisons should help investigators 
to get a clearer picture of how well 
TBI therapies work. They need as 
much help as they can get, says 
Koroshetz: for all the progress in 
understanding the causes and pro- 
gression of silent TBI, treatments 
remain elusive. Dozens of clinical 
trials have been done over the past 


two decades, looking at everything 
from antioxidants to hyperbaric 
oxygen. “No one has been able to 
figure out how to make a differ- 
ence,” says Koroshetz. “In terms 
of outcomes in patients, there is 
very little, if any, evidence that any
single thing works.” 

The Pentagon has come a long 
way from just three years ago, 
when TBI was mostly ignored. 
In January, it became mandatory 
for the military to track all con- 
cussive injuries, and troops now 
receive pre-deployment cognitive 
testing that can be used as a base- 
line in case they are later affected 
by concussion. Experiments with 
brain-wave measurements are 
also under way. And with the 
new reporting requirements, the 
military is creating what is likely 
to be the single largest repository 
of data on TBI. 

The question is how to keep the 
momentum going. That may prove 
difficult, given the United States’ 
mounting budget woes. After 
the initial boost in 2007, funding 
levels for TBI research dropped 
dramatically. In fiscal year 2011, 
the congressional appropriation 
specifically for the Pentagon’s 
brain-injury research is expected 
to be just $45 million. “Where's 


the interest, where’s the support, 
where's the national effort?” asks 
Colonel Dallas Hack, director of 
the army's Combat Casualty Care 
Research Program at Fort Detrick. 

Brigadier General Robert 
Thomas, the army’s assistant 
surgeon-general, hopes that the 
military’s involvement is now 
doing for research and treatment 
of brain injuries what it has done 
in the past for yellow fever, trauma 
care and medical evacuation. For 
better or worse, he says, “combat 
is the greatest catalyst to medical 
innovation”. 

But in the meantime, Burt 
and the hundreds of thousands 
of other people with brain inju- 
ries can only hope that progress 
comes in time to help them. Once 
an ambitious multi-tasker, Burt 
says he now has problems with 
basic tasks. These days, he can get 
around the house, and even man- 
age trips to the store — as long as 
he makes lists or uses some other 
form of reminder. “But I will 
never be what I was,” he says.


Sharon Weinberger is a 
Carnegie fellow at Northwestern 
University’s Medill School of 
Journalism.

Global security: US officers collaborating with Kenyan community health workers in 2011.


Joining forces 


Civilians and the military must cooperate on global 
disease control, say David Blazes and Kevin Russell. 


“Vulnerability is universal,” wrote
Margaret Chan, director-general
of the World Health Organization
(WHO), in The World Health Report 2007.
The words ring even truer today. Height- 
ened concern about the 2009 influenza 
pandemic, the rapid global spread of anti- 
microbial-resistant organisms and even the 
popularity of Contagion, a film featuring a 
lethal airborne virus, capture this sentiment. 
Global public health has become a 
national-security and foreign-policy issue. 
Rapid transportation of people, diseases and 
information has increased public-health 
threats — from emerging influenza strains 
to bioterrorism — that cannot be managed 
solely through conventional practices such 
as isolation and quarantine. Effective global 
disease surveillance, timely detection of 


outbreaks and appropriate responses that 
help to control epidemics are the essential 
tools of public-health security. 

Here, civilian organizations have much 
to gain by working with the military. While 
many public-health agencies struggle for 
funds, the militaries of various nations are 
investing in public-health security. Military 
scientific efforts towards characterization, 
prevention and vaccine development for 
emerging infectious diseases, for example, 
improve the lives of civilians as well as sol- 
diers, in peace and war. 

But tensions can arise from the different
priorities of civilian and military groups.
Our experience leading US military disease 
surveillance activities leaves us convinced 
that such vital collaborations can succeed if 
there is transparency and trust on all sides. 


FROM SOLDIERS TO CITIZENS 

Armies have long worked to prevent their 
personnel from contracting or spreading 
diseases, in the process making seminal 
contributions to public-health security that 
also benefit civilians. Ronald Ross, a Brit- 
ish officer in the Indian Medical Service in 
the late nineteenth century, was the first to 
work out that Anopheles mosquitoes transmit 
malaria to humans. During the building of 
the Panama Canal at the start of the twen- 
tieth century, US Army researcher Walter 
Reed made discoveries about yellow fever 
that helped to control the disease and allow 
the completion of the construction, which 
opened new trade routes. US Army scien- 
tists developed vaccines for hepatitis A in 
the 1990s and hepatitis E in the 2000s. And
in 2009, working with local Thai officials and 
others, US Army scientists developed the first 
vaccine to partially protect against HIV.

Indeed, the US Department of Defense 
(DOD) dedicates hundreds of millions of 
dollars every year to understanding infectious 
diseases and pathogens worldwide. Since 
1997, the DOD Global Emerging Infections 
Surveillance and Response System (GEIS) 
has spent about US$54 million a year on 
emerging infectious diseases. It coordinates 
a network of institutes that includes research 
laboratories in Egypt, Cambodia, Peru, Thai- 
land and Kenya. Scientists in these labs have 
made breakthroughs including isolation of 
new pathogens, the first description of Plas- 
modium falciparum that are resistant to arte- 
misinin antimalarials and contributions to 
annual flu vaccines (including the seed strain 
for the 2009 H1N1 influenza A virus).

The scope of DOD investment is broad. In 
addition to disease surveillance, it includes: 
enhancing global biosafety and securing 
existing high-risk biological agents; HIV 
prevention and treatment; and the develop- 
ment of diagnostics and vaccines for vector- 
borne infections and diarrhoeal disease. 
Several DOD laboratories collaborate with 
the WHO as reference laboratories, and 
with developing countries on topics including
occupational health, human subject
research protection, electronic disease
surveillance and outbreak response. 

These successes have not come easily. Some 
people are concerned that military engage- 
ment in public health shifts priorities away 
from health to security topics, even though 
security has been part of the WHO's remit 
since its inception — its constitution states 
that the “health of all peoples is fundamental 
to the attainment of peace and security”. Local 
officials and scientists sometimes hesitate to 
trust military public-health personnel, believ- 
ing that the military’s agenda is to protect its 
staff, citizens and allies ahead of others, or 
believing misinformation about the military’s 
engagement with biological weapons. 

Sometimes, open, mutually beneficial 
relationships are not possible. For example, 
in regions with active conflict, such as Iraq, 
US military officials conduct disease surveillance
among their forces, but it is often difficult
to focus on local health issues. But more com- 
monly, the military's aim is to maintain secu- 
rity for civilians and soldiers alike. A healthy 
society is more stable than an unhealthy one. 


THE WAY FORWARD 
The military can do much to build trust. 
When military scientists work with local 
scientists, by sharing projects and data 
and by jointly reporting results, they 
prove their commitment to transparency. 
By focusing on local diseases, they build 
relationships. It is in the military's interest 
to do so, because cosmopolitan diseases are 
more likely than exotic pandemic strains to 
affect populations, and widespread illness 
could compromise a region’s security. 
Local officials who engage with the mili- 
tary can harness a wealth of resources and 
expertise. Small pilot projects can help to 
build confidence among those on the ground. 
Transparency is a tenet of the International Health Regulations. The regulations (agreed to by 194 countries in 2005, entering into force in 2007) set standards for the detection, diagnosis, reporting and control of a public-health emergency of international concern. The framework also encourages developed countries to assist other states in building these core capabilities, which the GEIS programme has endeavoured to provide through its efforts in developing countries.

The US military’s commitment to transpar- 
ency was demonstrated in the 2009 H1N1 
influenza pandemic. The first cases were dis- 
covered by the Naval Health Research Center 
in San Diego, California, (a hub of the GEIS 
network) and were reported to the WHO 
through the US Centers for Disease Control 
and Prevention (CDC). Researchers across 
the GEIS network (see ‘A global network’) assisted 14 other nations in making their first diagnoses.

The GEIS network also promotes sharing genetic data on potential pathogens freely through GenBank submissions. In the past year, the GEIS
network has deposited genetic sequences for 
more than 1,000 strains of influenza A from 
around the world, to increase worldwide 
representation in the WHO’s Global Influ- 
enza Surveillance and Response System. 
This open approach contrasts with the ‘viral 
sovereignty’ attitude adopted by some coun- 
tries, which in the past have not shared influ- 
enza samples because of inequitable access to 
diagnostics, vaccines or treatments derived 
from viruses originating in their country. 
In response, this spring, the WHO created 
a Pandemic Influenza Preparedness frame- 
work for virus sharing, benefits sharing and 
standard material transfer agreements. 

[Figure: A GLOBAL NETWORK. The US military supports public-health initiatives in many countries through its Global Emerging Infections Surveillance and Response System (GEIS) and the DOD network of laboratories.]

A model lab for scientific transparency, institutional trust and effective public-health security is the Naval Medical Research Unit 3 (NAMRU-3) in Cairo. It was established in the 1940s to work with the Egyptian Ministry of Health on the fight against typhus, at the time a cause of epidemics in Egypt and worldwide. The lab has
since become integral in studying a variety 
of infectious diseases such as food-borne 
and respiratory illness that affect locals and 
military personnel. The relationship was so 
valued that the lab was the only official US 
government presence to remain in Egypt 
during the Six-Day War in 1967. 

Work on H5N1 avian influenza and other
infectious diseases continues at NAMRU-3, 
where 250 Egyptian scientists and technicians 
work alongside 21 US military colleagues.
Just after the H1N1 influenza pandemic was
declared in May 2009, NAMRU-3, with CDC
assistance, trained 73 scientists from 32
countries within 3 weeks on molecular diagnosis
of this new strain. Trainees were accepted
regardless of country of origin, on the basis
of need for public-health assistance alone.

The development of an open-source 
software system for electronic disease sur- 
veillance is another military effort that has 
benefitted international and local disease- 
monitoring programmes. A partnership 
between GEIS and the Johns Hopkins 
Applied Physics Lab in Laurel, Maryland, 
created the Suite for Automated Global Elec- 
tronic bioSurveillance, which can use mobile 
phones to report cases of disease, by voice 
or text message, and then collate the data to 
inform public-health leadership. This system
has been piloted in Peru, the Philippines
and Cambodia. It is being offered to all free 
of charge and with no requirement to share 
data, although sharing aggregate informa- 
tion with the WHO can be facilitated by the 
system and is encouraged. 

In this time of increasing global complex- 
ity and fiscal constraints, all components of 
society, including the military, should work 
together to secure global public health 
through transparent actions. The struggle 
between life and death plays out both on 
the battlefield and in the hospital. It is time 
we fought for global public-health security 
together.

SEE EDITORIAL P.369

The world’s most independent
defence science advisers 


Ann Finkbeiner explains JASON, the autonomous group of academics that has been 
reporting to the US government on military matters for more than 50 years. 


Ten years ago, for a short time, a
40-year-long relationship between
the US Department of Defense and 
JASON, a small, secret group of elite science 
advisers, came apart. The Defense Advanced 
Research Projects Agency, or DARPA, 
directed JASON to add three specific people 
as members. JASON replied that it selected 
its own members according to exact- 
ing criteria, which the three did not meet. 
DARPA pointed out that it was the chan- 
nel through which JASON subcontracted 
with others, plus the source of nearly half its 
budget, so JASON should accept the three, or 
else no channel and no money. JASON chose 
the latter. The relationship with DARPA was 
over and remains so. 

JASON would have been out of business, 
but for the flurry of phone calls and e-mails 
that then went up and down the defence 
hierarchy. The office at the top, that of the 
Secretary of Defense, sent a polite note to the 
office of the Assistant Secretary of Defense for 
Research and Engineering (ASDR&E), which 
sits just above DARPA: “Please look into this 
JASON issue and see if it makes sense to retain 


them.” The ASDR&E thought it did and pro- 
vided JASON with a new channel and new 
money. As a result, JASON’s position as a freelance
government adviser now seems secure.

Here's the point: if you're a government 
and you want defence science advice that 
has no possible self-interest and that you 
can trust to be as close to the truth as nature 
allows, then you want an adviser that is inde- 
pendent enough to divorce you. 

JASON is autonomous and it isn't obliged to 
please; its reports are technical and famously 
neutral. It has managed to stay in business for 
50 years by maintaining a moving balance 
between usefulness and independence. To 
meet the government’s evolving national- 
security problems, JASON adjusted its 
scientific expertise. And to keep the JASONs 
in JASON, even as members changed from
theorists with summers off to scientists with
labs to run, JASON remained its own boss.

JASON comprises 30-40 scientists, mostly
stellar academics, usually with broad inter- 
ests, all with top-secret clearances. The 
scientists meet for six weeks every summer 
in La Jolla, California, in what they say are 
grubby little offices, to answer questions 
from five to ten sponsors, all government 
agencies. Questions must be well defined, 
specific, answerable and useful. So, as the 
JASONs say, no “standing around admiring
the problem”. And answers must be techni- 
cal, that is, based on equations and argu- 
ments from scientific first principles. JASON 
occasionally does experiments but declines 
to provide examples. It doesn't touch policy. 

About 40% of the questions it tackles 
come from a changing stable of sponsors 
— including the Department of Homeland 
Security, for which JASON has done studies 
on the detection of radiological material on 
cargo ships. The other 60% is split between 
the Department of Energy (on the nuclear 
stockpile, which JASON has so far judged 
to be disease-free); the ASDR&E (on the 
unpredictability of rare events such as terrorist
attacks, given a surprising lack of good
data sets); and the intelligence community
(we're unlikely ever to know). Half to three- 
quarters of JASON’s studies are classified. 
JASON currently has more requests for stud- 
ies than it can accept. 

Governments have many ways of getting 
advice from scientists, but JASON is unique. 
The other scientists advising the defence 
department are chosen by the department 
and often come from the defence industry. 
The National Academy of Sciences is self- 
selected and highly respected, but it prefers 
to work on unclassified studies, doesn't do 
its own research and its reports can take 
years. Organizations such as MITRE are science
advisory corporations that work only
for the government. The closest analogues 
to JASON in size and speed are a handful of 
elite groups of academics — the US Infor- 
mation Science and Technology group and 
the UK Blackett Group, for instance — that 
advise specific government sponsors, such 
as DARPA or the UK Ministry of Defence. 
But these sponsors created the groups, help 
to choose their members and the members 
typically rotate out after a few years. 

JASON created itself: in January 1960, 
around 20 atomic physicists, mostly theo- 
rists, got together to advise a post-Sputnik 
government on nuclear matters. Research- 
ers have always been invited to join only by 
JASON’s self-appointed management. They
stay on as active members or as senior advis- 
ers as long as they like, unless asked to quit by 
management. And (because everyone asks) 
JASON is not an acronym. It is a name given
to the group by Mildred Goldberger, wife of 
founding member Murph Goldberger, after 
the Greek myth, because she thought of the 
advisers as golden heroes. 

JASON’s autonomy raises the question 
of how the group manages to stay in busi- 
ness. One reason is its members’ longevity: 
JASON can provide its sponsors with a kind 
of corporate memory. Incoming head Gerald 
Joyce, a biochemist at the Scripps Research 
Institute in La Jolla, has been a member for 
14 years and hasn't hit the median yet. It 
takes that long, he says, to “know who’s on 
the chess board and how things are done”. 
Otherwise, like any freelancer with a cadre 
of repeat customers, JASON ensures that its 
expertise and its customers’ questions are, 
as JASONs say, impedance-matched — an 
electronics term for complementary cables 
that allow the fullest flow of current. 


NUCLEAR TO NUCLEUS 

In its first decade, JASON’s studies were for 
DARPA and were largely related to cold-war 
problems of nuclear test bans and missile 
defence. They addressed questions such as 
‘could the right satellite detect the infrared 
signature of an enemy missile’s launch?’ By 
the early 1970s, DARPA had branched out 
beyond physics to materials and computer 


sciences. After some acrimony, JASON 
began adding non-physicists. 

At the same time, it also began to work 
for other sponsors — including the Central 
Intelligence Agency, NASA and the new 
Department of Energy. For these it stud- 
ied, for instance, pollution from supersonic 
jetliners and primitive models of climate 
change. Through the 1970s and early 1980s, 
another new sponsor, the US Navy, asked 
for studies on the internal ocean waves left 
by submarines; on the use of extremely long 
radio waves to communicate with submarines
at great depths; and on a technique that
became ocean acoustic tomography. Accord- 
ingly, by the end of the 1980s, members 
included computer scientists, astronomers, 
geoscientists, mathematicians, materials 
scientists, engineers and oceanographers. 

Current studies include cybersecurity, 
defences against improvised explosive devices 
and pharmaceutical intervention in cogni- 
tion. Notably, several projects have been on 
defending against biological weapons. JASON 
now has what current head, Roy Schwitters, a 
physicist at the University of Texas at Austin, 
calls “a crackerjack biology crowd”. In fact, in
the 1990s, when JASON began to add biolo- 
gists, it had to run cross-cultural training ses- 
sions, even teaching physicists to sequence 
DNA. Since then its studies have included: 
ways to counter bioweapons designers who 
might use genetic engineering to produce 
deadly microbes; possible links between the 
Navy’s sonar exercises and mini-epidemics 
of beached whales; and, most recently, 
the potential benefits of analysing genetic
information from all military personnel.
Today, only half of the JASONs are physicists. 

JASON’s impact is difficult to gauge. It 
does not formally track its studies’ outcomes; 
most reports disappear into classified pro- 
grammes or are lost in the niceties of the 
governmental decision-making process. 
The JASONs feel that their niche is advising 
on classified science, which cannot be sent 
out for peer review, is done in small, isolated 
communities of government scientists and, 
as a result, risks straying out of the realm of 
reality. The JASONs speculate that they’ve 
saved the government billions of dollars in 
non-working technologies. 

Government sponsors would rather not 
talk about JASON’s influence except in gen- 
eralities. They say that the JASONs are often 
arrogant and naive but also scientifically 
smart, reliably objective and not subject to 
political expedience. They say they don't ask 
JASON unless they really want the answer, 
and will offer only hypothetical examples. Say 
the director of an agency wants to develop a 
technology, but it turns out not to work. He 
asks his agency’s science advisory board for 
help. The board hesitates to nix the technol- 
ogy, so it recommends further research. The 
director suspects the board is being polite. He 
asks JASON because, as one sponsor put it, 
giving an unworkable technology to JASON 
is like throwing raw meat to a lion. 

Here’s a real case: on JASON’s advice, the 
ASDR&E killed a programme to develop a 
rail gun for the army. JASON had found that 
a gun powered by an electric current could 
shoot projectiles along a pair of metal rails 
farther and faster than conventional artillery, 
but making it small enough to put on a tank 
would require too many miracles. 


A VISIBLE IMPACT 
The exception to the general opacity about 
JASON’s impact is a series of studies the 
group has been doing since its birth on the 
technologies underlying nuclear test bans. 
The first JASONs — who were, or were 
trained by, scientists involved in the Man- 
hattan Project — inherited a need to control 
nuclear weapons. Early on, JASON worked on 
various ways to verify whether a country was 
cheating on a nuclear treaty by testing weap- 
ons. For instance, could you tell the difference 
between the seismic signature of a nuclear test 
and that of an earthquake? Would a test in an 
underground cave muffle the explosion’s seis- 
mic signature? Later, JASON worked on ways 
to find out whether the nuclear stockpile had 
become old and ineffectual, and therefore an 
unreliable deterrent. In 1995, a JASON study 
ruled that nuclear weapons could be judged 
reliable without being tested, allowing the 
administration of President Bill Clinton to 
sign the Comprehensive Test Ban Treaty. 
Current studies, done for the Department 
of Energy, continue to focus on the nuclear 


stockpile. JASON now sees a healthy 
stockpile as obviating the need for design- 
ing new weapons and as an alternative 
to resuming underground testing. One 
such study helped to doom the push, by 
the administration of former president 
George W. Bush, to develop a new bomb: 
the Reliable Replacement Warhead. Over 
the years, JASON’s studies have vetted the 
national laboratories’ nuclear stockpile 
programmes enough times that Congress 
occasionally mandates that a programme 
can't be re-funded until JASON reviews it. 

The JASONs feel strongly about this 
work. Scientists’ role in maintaining pub- 
lic confidence in the nuclear deterrent, 
says Schwitters, is “incredibly important”. 
Joyce agrees: “We feel responsible to the 
heritage.” 

Today, the group’s biggest challenge is 
keeping the JASONs attending regularly 
— they call it the ‘sticking coefficient’. 
The JASONs are paid a large amount — 
US$850 per day in 2004 — although they 
could make ten times more consulting 
for industry. They also admire each other 
and take pleasure in working together on 
topics new to all of them; they say it is like 
being in graduate school again. And their 
interest in the country’s security is intense. 

But working for six weeks of an aca- 
demic’s precious summer, summer after 
summer, carries costs to his (just 10% of 
the JASONs are women) research, career 
and family life. So some JASONs come to 
La Jolla for just four days a week, or come 
one summer and not the next, or show 
up only for certain studies. Nobody likes 
that: too much of JASON’s value to both 
its sponsors and its members depends on 
long, argumentative interactions. “It can't 
be done casually,” says one member. 

When Joyce becomes head in the 
autumn, he will survey JASON’s exper- 
tises — which he thinks of as finding key- 
words for each member — and ask regular 
sponsors which keywords they need most. 
He will tell the sponsors that the JASONs 
don't think they can usefully do social- 
science studies (occasionally requested 
for understanding insurgents and terror- 
ists) and remind them that JASON doesn't 
advise on policy. 

“We work with the sponsor to find the 
right study,” he says. “We operate on bill- 
able hours, and our budget is the sum of 
what we're doing. We're sort of a collec- 
tive independent contractor. There are 
others out there, but this model is very 
powerful.” 
An RQ-4 Global Hawk unmanned aerial vehicle before a mission in southwest Asia in November 2010. 


A world of killer apps 


Leaders are ill-prepared for the ethical complications 
of new ‘killer applications’, says P. W. Singer. 


A president arguing that his nation isn't 
at war because his forces are using 
only robotic weapons. An arms- 
control meeting forlornly trying to ban the 
development of armed autonomous robots. 
Criminals using tiny robotic helicopters in 
a jewellery heist. These are not tales from an 
Isaac Asimov novel; they are real events that 
happened within the past year. 

From gunpowder to the atomic bomb to 
robots, history is full of weapons technolo- 
gies so disruptive that they change the rules. 
These deadly applications, or ‘killer apps’, 
often begin in the military sector but have 
ripple effects beyond their intended uses. 
The Manhattan Project to develop the first 
atomic bomb was at its core a military- 
funded experiment to bundle the greatest 
explosive power into the smallest delivery 
package possible. 

But that research opened up entirely new 
areas of physics, revolutionized the energy 
industry and transformed world politics. 

What is different today is the speed with 
which our technology can outpace our ethi- 
cal and policy responses to it. Astounding 
advances grab the headlines so frequently 
that the public has become numb to their 
significance — whether it is robotic planes, 
directed-energy weapons such as high- 
energy lasers, or ‘electric skin’, tiny sensors 
that are applied to the body like tattoos. 

We are “giants” when it comes to tech- 
nology, but “ethical infants” when it comes 
to understanding its consequences, as US 
Army general Omar Bradley remarked 
in 1948. Bradley was referring to nuclear 
research, but as the pace of technological 
change takes off, that gulf — between our 
sophisticated inventions and our crude grasp 
of the consequences — continues to widen. 
We need to start bridging it. 


I, ROBOT 

Robotics is an excellent case study of this 
gulf. Over the past ten years, the United 
States and 45 other nations have gone from 
looking at robots as mere science fiction to 
using them in their military forces. For 
Left, new legs and eyes: a mock up of the New BigDog, intended to carry equipment for ground troops; right, a US soldier prepares an RQ-11 Raven in Iraq, 2006. 


example, the US military used only a 
handful of unmanned aerial systems in the 
2003 invasion of Iraq, but now has more than 
7,000 unmanned aerial systems and 12,000 
unmanned ground systems in its inven- 
tory. As a sign of things to come, the US Air 
Force now trains more unmanned-systems 
operators than fighter and bomber pilots 
combined. 

The effect of this shift goes beyond pilots’ 
lives saved. US President Barack Obama 
recently argued that he did not need con- 
gressional approval for military operations 
in Libya because they were carried out by 
unmanned aerial systems such as the MQ-1 
Predator and the MQ-9 Reaper. In Pakistan, 
US unmanned systems have made more 
than 250 strikes against suspected terrorists 
since 2004. Notably, these strikes are carried 
out by CIA drones rather than military ones, 
meaning even less oversight. The number of 
US drone strikes last year alone was several 
times larger than it was in the opening round 
of the Kosovo war, but — unlike that war — 
there has been no congressional authoriza- 
tion and little public debate. 

The growth in non-military uses of robot- 
ics, especially those developed originally for 
the military, also raises ethical issues. Police 
departments in cities such as Miami, Flor- 
ida, and Ogden, Utah, have sought special 
licences to operate unmanned aerial surveil- 
lance systems. This past spring, Congress 
legislated that US civilian airspace should 
be opened to allow more widespread use 
of such systems by 2015. This will mean a 
boom for the robotics industry, but it will 
also raise new challenges to legal concepts 
such as privacy or probable cause for search 
or arrest. Police once needed warrants if 
they wanted to peek over citizens’ fences; 
now they have the technology to do it from 
above, over an entire city. As one federal dis- 
trict court judge told me, this is “a Supreme 
Court case waiting to happen”. 

History shows us that neglecting to 
address these issues of law and ethics can 
have immense consequences. Using a sub- 
marine to attack shipping, for example, was 
once science fiction. When it became reality, 
the dispute over ‘fair use’ of such technology 
drew the United States into the First World 
War, ultimately leading to the nation’s rise as 
a superpower. 


CODE OF ETHICS 

Today, the US Air Force has argued that its 
unmanned spy planes, if targeted by radar, 
have the same right to defend themselves 
with ammunition as its pilots have. This 
conferral on unmanned systems of the right 
to pre-emptive ‘self’-defence makes sense 
from one perspective, but could also be a 
legal-dispute-turned-international-crisis 
in the making, as well as a huge (and prob- 
ably unintentional) first step for the cause of 
robots’ rights. 

The importance and urgency of such com- 
plex challenges demands cross-disciplinary 
discussion — among technology research- 
ers and manufacturers, customers and 
users, regulators and policy-makers, social 
scientists and philosophers. But traversing 
the boundaries between those sectors still 
feels like crossing between foreign lands. 

A major reason for this is insularity. Aca- 
demic journals of each field focus inward, 
professional conferences are attended only 
by the like-minded, and those who attempt 
to straddle disciplines or engage the public 
are viewed as ‘less serious’. In robotics, a 
striking example of this disconnect comes 
from a survey of the 25 stakeholders who 
most shape the field, conducted by the 
field’s professional trade group, the Asso- 
ciation for Unmanned Vehicle Systems 
International, based in Arlington, Virginia. 
Asked whether they foresaw that the con- 
tinued development of unmanned systems 
might bring ‘any social, ethical, or moral 
problems’, 60% of these leaders answered 
with a simple ‘No’. I experienced this head- 
in-the-sand attitude when a professor 
sent me an angry e-mail after a talk I gave 
at a leading engineering school. He chas- 
tised me for “troubling” his students “by 
asking them to think about the ethics of 
their work”. 

In turn, our policy leaders are ill-prepared 
for the questions and debates that inevitably 
follow technological developments. Those 
responsible for funding and deployment 
decisions often fail to understand even the 
basics of the technology they’re consider- 
ing. I witnessed this when a senior adviser 
to the US defence secretary expressed sur- 
prise to me that the United States was using 
“so many” robotic systems (even though 
he drove the budget that paid for them), 
and then told me how he thought a three- 
dimensional version of the Internet might 
be possible “one day”. He spoke about virtual 
worlds as if they were an exotic concept like 
time travel, apparently unaware that they 
already exist. 

Similarly, when I gave a talk last year to the 
strategy office at the US Pentagon on some 
of the military, policy, legal and ethical rami- 
fications of the growing use of robotics, one 
senior officer asked me: “Who is thinking 
about all this stuff?” I replied: “Everyone 
thinks it’s you!” 


BRAVE NEW WORLD 

It doesn't have to be this way. Our academic 
training still follows the specialized model. 
Top researchers in artificial intelligence may 
go through their entire university educations 
without taking a single class on ethics, his- 
tory or law. 

In turn, there are public-policy under- 
graduates, international-law professors and 
philosophy doctoral students writing essays, 
articles and dissertations on military drones 
without having seen one, learned how it 
works or even interviewed anyone who has. 

No future scientist or policy-maker should 
graduate so ill-equipped. We can and must 
start training students to engage with com- 
plex multidisciplinary problems, by requir- 
ing those in the sciences to take courses in 
the humanities and vice versa. 

At a public policy level, we need a new 
approach to handling major programmes of 
Clockwise from left: a device that fits in a backpack; an MQ-1 Predator; information gathering in Iraq; TALON robots can be used for bomb disarming or combat. 


technology research, one that always includes 
an exploration of each one’s broader rami- 
fications outside the lab. If environmental 
impact surveys are mandatory to begin 
construction of new laboratory buildings, 
why are no similar ‘ethical, legal, and social 
implications’ (ELSI) studies required of the 
research that goes on in them once built? A 
better model is the one used by the Human 
Genome Project, which set aside up to 5% of 
its annual budget for ELSI discussions. 

We have to be real- 
istic about what such 
studies can achieve. 
They don’t solve all 
the tough problems, 
but they can provoke 
debates that will help us 
identify the true issues. 
Today, for example, sci- 
entists recognize that 
their work in genetic 

testing has implications in areas such as 
health care or privacy, and policy-makers 
are aware that the field is potentially power- 
ful. But no one is wasting time on unrealis- 
tic arguments about, for example, cloning 
Super Soldiers. The debates on the impli- 
cations of genetic testing are not always 
resolved, but the tenor and content of the 
discussion — in both the lab and the policy 
spheres — are much improved. Yet genetics 
is the exception to the rule. 

By comparison, those working on killer 
apps in robotics and other cutting-edge 
research fields should be asking themselves 
questions such as: from whom is it ethical 
to accept research and development money? 
What attributes, such as weaponization, 
autonomy or intelligence, should I design 
into my technology? Which organizations 
and individuals should be allowed to buy 
and use my technology? Who should own 
or be able to access information gathered 
by my technology? If someone is harmed 
in association with the technology, who is 
responsible, and how is this determined? 

Yet, unlike future medical professionals, 
researchers seeking answers to these ques- 
tions received little training on ethics in 
graduate school and have no professional 
code or support structure to turn to. Policy- 
makers and legislators should also be better 
prepared to deal with the issues posed by 
taking a killer app beyond the lab. 

We must get cracking. More killer apps 
are coming, and they’ll bring a host of 
grand possibilities and perils with them. 
Mathematician-turned-satirist Tom Lehrer 
once wrote: “‘Once the rockets are up, who 
cares where they come down? That’s not my 
department,’ says Wernher von Braun.” 

Until we start learning how to wrestle with 
the implications of our technologies, the joke 
will be on the rest of us. SEE EDITORIAL P.369 


P. W. Singer is director of the 21st Century 
Defense Initiative at the Brookings Institution, 
Washington DC 20036, USA, and author of 
Wired for War: The Robotics Revolution 
and Conflict in the 21st Century. He can 

be contacted through www.pwsinger.com. 
Oil supplies have remained steady for 40 years despite disruptions such as the Kuwaiti well fires in 1991. 


Burning desires 


An obsession with oil distorts an account of the 
security of energy supplies, argues Vaclav Smil. 


An incessant flow of energy is the basis 
of modern civilization, so a secure 
energy supply — particularly the 
availability of oil — is inevitably the focus 
of much public and media interest. Energy 
expert Daniel Yergin duly focuses on the 
past, present and future supply of crude oil 
and on concerns about the security of the 
fuel’s supply. But with his narrow focus on 
oil, he passes up the opportunity to delve 
more deeply into our energy challenge. 

In The Quest, Yergin, chairman of the 
US consultancy IHS Cambridge Energy 
Research Associates, ranges over the history 
of modern oil and gas production and elec- 
tricity generation, the security of petroleum 
supplies and the evolution of concerns about 
global warming. He deals with key episodes 
of modern oil development, such as increas- 
ing Russian production, rising Chinese 
demand, supply disruptions, the controversy 
over peak oil production and unconventional 
resources such as oil shale. Yergin then turns 
to global warming and 
The book, which is 
sliced into more than 
400 short sections, 
covers policy and eco- 
nomics more than sci- 
ence and technology. 
Its analysis of energy 
sources is uneven. 
Coal, for example, war- 
rants a single page. Yet 
during the twentieth 
century, coal supplied 
the world with more 
energy than did oil. In 
2010, coal combustion 
accounted for 30% of 
all global commercial 
energy (compared 
with nearly 34% for oil) and 40% of electricity 
generation. 

Yergin is no catastrophist. He presents 
ample evidence to counter the notion that 
we are running out of oil: new discoveries, 
exploitation of additional reserves in existing 
fields and unconventional oil resources will 
maintain the flow for the foreseeable future, 
he says. Globally, the market has remained 
well supplied despite the comings and 
goings of dictators and ayatollahs, and major 
disruptions in output. 

Since the early 1970s, there have been 
many such disruptions, starting with the 
embargo by the Organization of the Petro- 
leum Exporting Countries (OPEC) in 
1973-74, and the decline of extraction in 
the United States, which was the world’s 
largest producer until 1975. These were 
followed by the Iranian revolution in 1979; 
Iraq’s takeover of Kuwait in 1990; the demise 
of the Soviet Union in 1991; the rise 
of oil imports by China, which was a net 
exporter of oil until 1994; the US invasion of 
Iraq in 2003; and, most recently, the Libyan 
civil war. 

Through all of this, global oil extraction 
rose by two-thirds, from 2.3 billion tonnes in 
1970 to 3.9 billion tonnes in 2010. Adjusted 
for inflation, crude oil is cheaper than it was 
30 years ago, and in many countries, gov- 
ernments take a larger chunk of the price of 
petrol in tax than goes to the demonized 
OPEC or multinational oil companies. 

Nevertheless, Yergin is sufficiently wor- 
ried about maintaining an undisrupted oil 
supply that he feels energy security should 
be integral to foreign policy, given the high 
costs and long lead times of energy develop- 
ment. But I would argue that, particularly 
in rich countries, energy security depends 
more on using fuel and electricity rationally. 

More important than OPEC’s manoeu- 
vrings is our continuing reliance on hun- 
dreds of millions of inexcusably inefficient 
vehicles, our preference for poorly insu- 
lated houses, our often mindless mobility 
and our consumption of energy-intensive 
junk. And as for the rapidly modernizing 
countries, is China’s only choice to copy the 
US model of mass car ownership? 

Yergin makes no comparisons of what 
nations actually do with energy — for 
instance, how much they need to secure a 
decent quality of life. Poor people in devel- 
oping countries obviously need more energy, 
but how much more? As much as is already 
consumed, per capita, by their urban com- 
patriots? Or, eventually, as much as in the 
United States, where the usage per head is 
twice as high as that in the richest European 
countries? 

The book is silent on these matters. 
Instead, Yergin concludes that “this quest 
for energy goes without end”. But it cannot 
— and should not. 




Vaclav Smil is an energy scientist and 
professor in the Faculty of Environment, 
University of Manitoba, Winnipeg, Canada. 
e-mail: vsmil@cc.umanitoba.ca 
IN RETROSPECT 


Normal Accidents 


As Japan strives to overcome the Fukushima nuclear disaster, Nick Pidgeon reflects 
on Charles Perrow’s classic book about why complex technologies fail. 


it was with Charles Perrow’s influential 

book Normal Accidents. Its publication 
in 1984 was followed by a string of major 
technological disasters — including the 
Bhopal industrial chemical leak in India 
in December 1984, the explosion of the US 
space shuttle Challenger in January 1986, and 
the Chernobyl nuclear accident in Ukraine in 
April that year. Each cried out for the sort 
of detailed analysis that Perrow supplied. 
Now, more than a year after the Deepwater 
Horizon oil-rig blowout in the Gulf of Mexico, 
and in the aftermath of the nuclear disaster 
at Fukushima Daiichi in Japan in March, the 
book’s message seems again prescient. 

Normal Accidents contributed key con- 
cepts to a set of intellectual developments 
in the 1980s that revolutionized how we 
think about safety and risk. It made the case 
for examining technological failures as the 
product of complex interacting systems, and 
highlighted organizational and manage- 
ment factors as the main causes of failures. 
Technological disasters could no longer be 
ascribed to isolated equipment malfunction, 
operator error or random acts of God. 

As one of the foremost US authorities on 
the sociology of large organizations, Per- 
row admits that he came to the topic of risk 
and technology almost by mistake. He was 
invited to provide a background paper for 
the President’s Commission On The Acci- 
dent At Three Mile Island, which enquired 
into the 1979 nuclear 
incident near Harris- 
burg, Pennsylvania. 
A very small leak of 
water into an instru- 
mentation system had 
triggered an escalat- 
ing chain of events at 
the Three Mile Island 
plant, involving both 
component malfunc- 
tions and operator 
errors. The result was 
a major loss of cool- 
ant to the reactor, 
not unlike the recent 
Radiation monitoring at Three Mile Island in 1979. 


Mile Island accident, nor of the Fukushima 
disaster. Numerous, seemingly incon- 
sequential difficulties that had not been 
predicted by the plant designers combined to 
defeat multiple safety systems. Perrow con- 
cluded that the failure at Three Mile Island 
was a consequence of the system’s immense 
complexity. Such modern high-risk systems, 
he realized, were prone to failures however 
well they were managed. It was inevitable 
that they would eventually suffer what he 
termed a ‘normal accident’. 

Therefore, he suggested, we might do better 
to contemplate a radical redesign or, if that 
was not possible, to abandon such technolo- 
gies entirely. Foreseeing one of the problems 
at Fukushima, Perrow wrote in 1984 that 
“nuclear plants could be made marginally 
less complex if the spent storage pool were 
removed from the premises”. Because such 
pools typically require constant cooling and 
attention, a reactor accident forcing an evac- 
uation of the building, or a complete loss of 
power to the fuel cooling system, would then 
risk a serious fuel fire and significant release 
of radiation from the storage ponds. 

Normal Accidents introduced two 
concepts: ‘interactive complexity’, meaning 
the number and degree of system inter- 
relationships; and ‘tight coupling’, or 
the degree to which initial failures can 
concatenate rapidly to bring down other 
parts of the system. Universities, for exam- 
ple, are interactively complex but only 
loosely coupled — decisions are often influ- 
enced by unanticipated factors but effects 
are felt slowly. By contrast, modern produc- 
tion lines are often tightly coupled, with 
close and rapid transformations between 
one stage and the next, but have simple 
relationships between those stages. Neither 
tends to suffer systemic accidents. 

When systems exhibit both high com- 
plexity and tight coupling, as at Three Mile 
Island, the risk of failure becomes high. 
Worse still, according to Perrow, the addition 
of more safety devices — the stock response 
to a previous failure — might further reduce 
the safety margins if it adds complexity. For 
example, when a British European Airways 
Trident jet crashed with the loss of all lives 
near London Heathrow Airport in 1972 after 
it stalled during take-off, the pilots were una- 
ble to diagnose the fault amid at least nine 
other cockpit warnings and alarms that went 
off as a result. 


A HISTORY OF FAILURE 
Six years before Normal Accidents, Brit- 
ish sociologist Barry Turner published his 
analysis of 80 major UK system failures in the 
lesser-known but similarly influential book 
Man-Made Disasters (Wykeham Science 
Press, 1978; I contributed to the posthumous 
1997 second edition). Turner, too, empha- 
sized the ways in which system complexity 
can defeat attempts to anticipate risks. But his 
theory differed crucially from Perrow’s — it 
gave a more acute description of the organi- 
zational, management and communication 
failings that occur before an accident. 
Major accidents do not spring into life 
on the day of the visible failure; they have a 
social and cultural context and a history. The 
problems at Three Mile Island, as we now 
know, were foreshadowed by similar near- 
miss events in other US pressurized-water 
plants — notably at the Davis-Besse nuclear 
plant near Oak Harbor, Ohio, in 1977. This 
raises the question of why safety information 
and learning were not shared among 
operators. Disaster analysis of man-made 
incidents — in contrast to normal accidents 
— implied that if the background conditions 
incubated over time, there was some possi- 
bility of prior detection even when systems 
were complex. 

Tensions are evident in the comparison of 
Turner's and Perrow’s accounts — between 
the possibilities for foresight and fatal- 
ism. These resurfaced several years later in 
debates among US scholars over ‘normal 
accidents’ versus ‘high-reliability organi- 
zations’. Similar discussions were seen in 
related work in Europe on safety culture and 
organizational accidents. The fundamental 
question, posed by influential political sci- 
entist Scott Sagan in his book The Limits of 
Safety (Princeton University Press, 1993), 
was: are normal accidents inevitable or can 
the combination of interactive complexity 
and tight coupling be safely managed? 

The high-reliability researchers believed 
that it could. They studied cases such as the 
flight operations on aircraft carriers, where 
the conditions for normal accidents exist 
but the systems operate safely each day. 
They identified cultural factors, such as col- 
lective decision-making and organizational 
learning, as key reasons why an otherwise 
toxic combination of complexity and risk 
can be managed. By contrast, critics such as 
Sagan pointed out that even these systems 
had serious near-misses from time to time, 
and that normal accidents could always 
occur as a result. 

That debate is still unresolved. Never- 
theless, the analyses of Perrow and Turner 
were ahead of their time and their legacy 
remains profound. Many subsequent acci- 
dent inquiries drew on their insights — most 
notably the space shuttle Columbia Accident 
Investigation Board report in 2003. Enquir- 
ies into the Fukushima disaster will benefit 
too, but we need a wider appreciation of 
how future normal accidents might gestate, 
and a better understanding of the actions of 
organizations and people, both intended and 
unintended, that generate major risks. 

The world still faces many systemic risk 
challenges, including those of runaway 
climate change, financial-market failure 
and information security. Although many 
advances in safety technology, engineering 
practice and risk management have been 
made over the past 30 years, organizational 
and technical complexity remain integral 
to the many systems that drive such risks. 
Normal Accidents is a testament to the 
value of rigorous thinking when applied 
to a critical problem. 


Nick Pidgeon is professor of applied 
psychology at Cardiff University, Wales, 
CF10 3AT, UK. 

e-mail: pidgeonn@cardiff.ac.uk 


Books in brief 


Lifeblood: How To Change The World, One Dead Mosquito at a Time 
Alex Perry C. HURST 208 pp. £16.99 (2011) 

Journalist Alex Perry chronicles two years of US philanthropist Ray 
Chambers’s crusade against malaria. A mix of science, history and 
research, this is a fascinating take on a disease that kills a million 
people a year. Chambers’s story is just as intriguing. Pragmatism, 
business sense and bullheadedness gave him an advantage over 
the formulaic and often cost-ineffective approaches of many aid 
agencies. His Wall Street clout helped him to bring world leaders 
on board. And his focus on solutions such as bed nets and poverty 
eradication has, says Perry, enabled him to save millions of lives. 


The Third Industrial Revolution: How Lateral Power Is 
Transforming Energy, the Economy, and the World 

Jeremy Rifkin PALGRAVE MACMILLAN 304 pp. £16.99 (2011) 

Green energy and the Internet will revolutionize society and 
environment. So argues economist Jeremy Rifkin in this blueprint 
for global change. The five pillars of his vision for a post-fossil 
economy are a shift to renewables; a miniature power plant in every 
building; hydrogen storage for intermittent energy; an ‘intergrid’ 
for sharing energy that harnesses Internet technology; and eco- 
transport that runs on plug-in electricity and fuel cells. With the 
European Union already on board, this is a big idea with backbone. 


Redirect: The Surprising New Science of Psychological Change 
Timothy D. Wilson LITTLE, BROWN 288 pp. $25.99 (2011) 

The stories we tell ourselves shape our lives, says social psychologist 
Timothy D. Wilson. Editing them can help us to redirect our thoughts 
and actions. He trawls through multitudinous ‘happiness formulae’ 
in popular-psychology books to show what doesn’t work — from 
critical incident stress debriefing for trauma victims to the Healthy 
Families America initiative to prevent child abuse. What does work, 
he says, is simpler, subtler and backed by sound research: writing 
exercises, ‘rewind’ therapy, helping children to develop healthy 
narratives of their own and practising tolerance. 


Invasion of the Body: Revolutions in Surgery 

Nicholas L. Tilney HARVARD UNIVERSITY PRESS 384 pp. $29.95 (2011) 
Tumours removed, joints replaced, organs transplanted: every 
weekday, 85,000 non-emergency operations take place in the 
United States alone. Distinguished US surgeon Nicholas L. Tilney 
intersperses moments from his own career with a rousing history 
of the evolution of surgery, breakthrough by breakthrough — from 
near-butchery to today’s fine-tuned procedures. Wading through the 
gore with aplomb, he covers anaesthesia, pharmaceuticals, asepsis, 
health-care reform, surgery in war and in peace, facial transplants 
and more. 


About Time: Cosmology and Culture at the Twilight of the Big Bang 
Adam Frank FREE PRESS 432 pp. $26 (2011) 

In this eloquent book, physicist and astronomer Adam Frank 
explores the interweaving of social and cosmological time. His 
trek through the history of humanity takes a parallel look at how 
we have gained a deeper grasp of the Universe during our time on 
Earth. Starting at the dawn of consciousness, he brings us through 
millennia of revolutions — from the development of agriculture, 
industry and the rise of cities to the emergence of string theory and 
ideas about eternal inflation and the multiverse. 
remains profound. Many subsequent accident
inquiries drew on their insights, most
notably the space shuttle Columbia Accident 
ies into the Fukushima disaster will benefit
too, but we need a wider appreciation of 
how future normal accidents might gestate, 
and a better understanding of the actions of 
organizations and people, both intended and 
unintended, that generate major risks. 

The world still faces many systemic risk 
challenges, including those of runaway 
and information security. Although many
advances in safety technology, engineering 
practice and risk management have been 
made over the past 30 years, organizational 
and technical complexity remain integral 
to the many systems that drive such risks. 
The SubSafe software creates a virtual submarine to train personnel to find safety equipment.


Q&A Robert Stone 
The virtual trainer 


Robert Stone, director of the Human Interface Technologies Team at the University of 
Birmingham, UK, develops ‘serious games’ for training soldiers and sailors. He explains 
how immersion in virtual worlds can prepare military personnel for their experiences on the 
battlefield — and help them to heal after they return. 


How useful are virtual environments in 
military training? 

Even with today’s technology, we can't fully 
replace the experience of live combat or 
military activities. But where current train- 
ing practices fall short, virtual environments 
or serious gaming can fill the gap. 


What sort of environments do you model? 
We train a range of military recruits — from 
new submariners to those who will patrol 
areas to locate improvised explosive devices 
in Afghanistan and the United Kingdom. 
One project involves a virtual British town. 
We program it for different scenarios — say, 
a suspect package left at a railway station. 
Army instructors work with trainees to pick 
and place objects, such as the command 
vehicle, cordons or robot disposal vehicles; 
they then discuss appropriate questions to 
ask the police or witnesses, to build up a 
strong picture of the developing incident. 


Is virtual training proven to work? 

Yes. For example, SubSafe is a three-dimensional
recreation of a nuclear submarine, built using
game-technology software. Naval instructors use
it to train submariners, particularly in locating
safety equipment: fire extinguishers,
breathing apparatus and thermal imagers.
When we evaluated the software with trainees
at the naval base in Devonport, UK, over
18 months, we found that there was a significant
improvement in the real-world performance
of the guys who got the software,
compared with those who did not. That’s
important because in the future, the United
Kingdom will have fewer submarines, which
means less time for trainees on real vessels.

NATURE.COM
For a Nature video of Robert Stone’s work:
go.nature.com/puuxso


Robert Stone and his training program SubSafe. 
BEYOND THE BOMB


Science and the military 
nature.com/military 


Can your work help soldiers who return from 
combat? 

We plan to use the technology to provide 
restorative environments for return- 
ees who have had serious surgery, such 
as amputations. Display screens at the 
ends of their hospital beds would show 
a view of sea- and landscapes in real 
time: evidence suggests that exposure to 
blue-green environments — such as sea- 
scapes and meadows — improves recov- 
ery and reduces the need for analgesics. 
Virtual environments will be used as 
part of therapeutic ‘guided imagery’ and 
relaxation techniques. We also aim to 
help amputees to prepare their remain- 
ing muscles for prosthetic devices, using 
motion-sensing devices in tandem with 
virtual environments — so that they can 
virtually walk along a beach, scuba-dive, 
skim stones or throw a frisbee. 


What are the challenges for an academic 
working with the military? 

It is a difficult area to break into. These 
projects are not something you can per- 
form in splendid academic isolation; you 
have to go into the field and understand 
the jobs that these guys do. I’ve flown on 
helicopters, been to sea in submarines and 
fired small Gatling guns off the side of 
Royal Navy ships. There are risks — I once 
experienced a helicopter gearbox failure, 
and have been marooned on the bottom of 
a Scottish loch in a damaged three-person 
submersible. 


Does it help that most recruits are familiar 
with computer games? 

Yes; the games culture is helping us to 
manage the expectations and engagement 
of future military-simulation users. Many 
young recruits have been put in front of 
simulators using older technologies, and 
have said: “We can do better than this at
home.” That way, you lose hearts and minds
instantly. But if you show them something 
that looks the part, such as the Call of Duty 
series of action games, you instantly have 
their attention. 


Are some people still sceptical? 

That never goes away. It’s not just the older 
members of the armed forces; we get it from 
younger officers as well. Fortunately, when 
they have seen the results, most of them 
come around very quickly. It is all part of 
the challenge.


INTERVIEW BY DANIEL CRESSEY 


Correspondence 


US pathology centre 
units will live on 

We wish to point out that several 
elements of the US Armed Forces 
Institute of Pathology (AFIP) 
have survived its closure and 
have been relocated within the 
Department of Defense (Nature 
476, 270-272; 2011). 

These units include the 
Depleted Uranium and 
Imbedded Fragment Laboratory, 
the Molecular Laboratory, 
Telepathology, the Automated 
Central Tumor Registry, 
the Veterinary Pathology 
Program (including residency 
training), the Armed Forces 
Medical Examiner function, 
the Histotechnology Training 
Program, and the congressionally 
funded Combat Wound Initiative. 
The Department of Veterans’ 
Affairs has assumed responsibility 
for the capabilities of the 
Biophysical Research Laboratory. 

The Department of Defense 
is working to make the vast 
collection of the former AFIP 
Tissue Repository (now part 
of the Joint Pathology Center) 
broadly available for research. 
At our request, the Institute 
of Medicine has convened a 
panel of national experts in 
biorepository management, 
medical informatics, medical 
ethics and pathology. The 
panel's task is to recommend the 
optimal and sustainable use of 
repository material; who should 
have access to it; technologies 
needed to utilize the repository; 
and ethical considerations over 
the use in research of material 
originally collected for clinical 
purposes. 

Several institutes are 
collaborating in pathology 
translational research and in 
supporting key clinical-research 
initiatives and education efforts. 
These include the Uniformed 
Services University of the 
Health Sciences, the Joint Task 
Force National Capital Region 
Medical (and its subordinate 
units, the Joint Pathology Center 
and Walter Reed National 
Military Medical Center), other
organizations in the Department
of Defense, and Veterans Affairs. 
These collaborations 
will be part of a new era 
of intergovernmental and 
public-private partnerships 
that will create vital research 
and clinical interactions. The 
celebrated history of AFIP and 
its importance to the broader 
research, clinical and academic 
communities provide the perfect 
foundation. 
Thomas P. Baker Joint Pathology 
Center, Silver Spring, Maryland, 
USA. thomas.p.baker@us.army.mil 
John M. Mateczun Joint Task 
Force National Capital Region 
Medical, Bethesda, Maryland, 
USA. 
Charles L. Rice Uniformed 
Services University of the Health 
Sciences, Bethesda, Maryland, 
USA. 


China’s academic 
autocracy must go 


Many scientists in China share 
Nai-Xing Wang’s dissatisfaction 
with the dominant role of journal 
impact factors in the country’s 
scientific evaluation system 
(Nature 476, 253; 2011). But I 
contend that even an imperfect 
law is better than no law. 

Replacing this rigid evaluation 
system with a more flexible 
one could send Chinese 
academia into chaos. Leaders 
of universities and research 
institutions could then establish 
their own evaluation systems, 
designing them to favour their 
particular interests. For example, 
a professor who is connected 
to a scientific journal might 
be tempted to rank papers 
published in that journal 
more highly when evaluating 
the performance of his or her 
university. 

Chinese researchers 
should benefit from the strict 
implementation of impact-factor 
evaluation criteria. But the 
rewards for meeting these targets 
aren’t always forthcoming.
A good relationship with the 
few leading executives who
control China’s academia is also
important, as it is for gaining 
access to the best scientific 
projects and for promotions. 
The key task is therefore to 
eradicate this autocratic control. 
Researchers would then be able to 
concentrate solely on their work. 
Nai-Zhuo Zhao Northeast 
Normal University, Changchun, 
China. 
naizhuo.zhao@gmail.com 


Review boards: vital 
to protect subjects 


On behalf of the Consortium of 
Independent Review Boards, a 
non-profit US organization for 
ethical review of clinical research 
and protection of participants, 

I object to your suggestion that 
a US government proposal to 
overhaul institutional review 
board (IRB) regulations would 
increase the use of commercial 
IRBs that have an “unsettling 
incentive to approve trials” 
(Nature 476, 125; 2011). 

You imply that independent 
IRBs put research subjects at risk. 
However, all review boards are 
subject to a high level of federal 
regulation and inspection. 
Inspections by the US Food 
and Drug Administration 
involve thorough site visits 
and assessment of policies, 
procedures and records. 

Independent IRBs protect 
research participants by 
reviewing studies conducted by 
private clinics and community 
and academic hospitals. Without 
such IRBs, patient access 
to promising experimental 
treatments would be curtailed 
and research would be unduly 
protracted. 

Consortium members 
commit to a code of ethics 
requiring them to protect the 
IRB from economic influences 
when reviewing research, 
minimize ‘IRB shopping’ and 
promote ethical marketing. 

The consortium also requires 
members to be accredited by 
independent bodies such as the 
Association for the Accreditation
of Human Research Protection
Programs (see Nature 477, 280; 
2011). 

The consortium believes 
that the proposed overhaul 
of the regulations warrants 
careful review, and urges the 
research community to focus on 
identifying substantive measures 
that support the highest 
standards for protecting human 
subjects in clinical research. 
Cami Gearhart Consortium 
of Independent Review Boards, 
Washington DC, USA. 
www.consortiumofirb.org 


Make integrity key 
to recruitment 


Far from being a vague ideal, the 
complex and sensitive issue of 
maintaining integrity in science is 
a critical imperative. In my view, 
it would help to demand and 
monitor integrity in scientists 
and managers from the outset 
(Nature 476, 251, 262; 2011). 

Most researchers know from 
their training that honesty 
is fundamental to scientific 
integrity. But some managers 
and agency officials can find 
themselves in difficult situations. 
A manager must cope with 
the competing pressures of 
supporting and protecting the 
scientists working on a project,
ensuring the survival of the 
scientific institution and pleasing 
unforgiving political masters — 
possibly all under public scrutiny. 
Even an honest manager might 
fear being undermined by a rival 
colleague or, worse, by a scientist 
who is cavalier about professional 
ethics. 

The only way to achieve 
scientific integrity across the 
board is to ensure that personal 
and professional values (as 
well as knowledge and skills) 
are primary criteria for the 
employment of both scientists 
and managers. These values must 
be demonstrated and constantly 
monitored, not just presumed. 
Alfred P. Zarb Leura, New South 
Wales, Australia. 
zarbap@ozemail.com.au 
FORUM Ageing


Longevity hits a roadblock 


Increased expression of sirtuin proteins has been shown to enhance lifespan in several organisms. New data indicate that 
some of the reported effects may have been due to confounding factors in experimental design. Here, experts discuss the 
significance of these data for research into ageing. SEE LETTER P.482 


THE PAPER IN BRIEF

• Previous work had shown that increased
expression of the yeast protein Sir2 and of
related sirtuin proteins in the nematode
Caenorhabditis elegans and the fruitfly
Drosophila melanogaster extends lifespan.

• Burnett et al. believed that the organisms’
genetic backgrounds, because they had not
been controlled, might have affected those
results, and so they repeated the experiments
after standardizing the backgrounds.

• By means of genetic backcrossing (Fig. 1),
the authors show that sirtuin overexpression
and longevity in C. elegans are separable.

• Two Drosophila strains that had been
reported to be long-lived as a result of sirtuin
overexpression were not long-lived when
compared with a genetically appropriate
control strain.

• The study refutes yet another previous
observation: that the increase in fly lifespan
that occurs with dietary restriction depends
on sirtuins.

• In an accompanying Brief Communication
Arising, Viswanathan and Guarente show
that sirtuin overexpression leads to a
smaller increase in C. elegans lifespan than
previously reported.


A valuable
background check


Burnett and colleagues’ in-depth study of
the influence of Sir2 overexpression on
lifespan in C. elegans and Drosophila, as well as
on diet-modulated longevity in the fly, shows
that the protein has no effect, challenging
previous publications. But since when does
Nature publish negative results?

Well, sirtuins are exceptional. Anything
that can influence ageing captures the
public’s imagination, and, since the reports
that sirtuins are involved in increasing
lifespan in invertebrates, research into these
proteins has exploded. Moreover, the seven
mammalian sirtuins have been implicated in
the suppression of numerous age-associated
diseases, including neurodegenerative
disorders, cardiac dysfunction, hearing loss
and neoplasia.

Meanwhile, heated controversies have arisen
regarding aspects of sirtuin biology. These
involve the proteins’ roles in the response to
dietary restriction; the mechanism of action
of reported sirtuin activators; and whether
the role of Sir2 in promoting longevity (first
identified in yeast) is evolutionarily
conserved in other organisms. The disagreements
fostered complaints of publication bias and
disproportionate scientific and commercial
emphasis on the role of sirtuins in longevity.
So, here we are.

Burnett et al. were thorough. They
generated an impressive collection of lifespan
experiments, and these were replicated, with
results contributed by several laboratories.
The data were supported by backcrossing and
by genetic and biochemical analyses. But an
accompanying report from Viswanathan and
Guarente, although confirming some of Burnett
and co-workers’ findings, shows that
worms overexpressing Sir2 retain modestly
increased (10-13%) lifespan after backcrossing,
consistent with data reported elsewhere.

At best, therefore, these papers indicate
that Sir2 overexpression is just one of
more than 100 genetic manipulations currently
known to increase worm and/or
fly lifespan to some degree, with many others
having larger effects.

Given the demonstrated importance 
of sirtuins in mammals, why rehash the 
precise role of Sir2 in worm and fly age- 
ing? First, challenging published results 
is an essential, self-correcting aspect of 
science. Second, invertebrate models
continue to contribute to the understanding
of sirtuin biology in mammals, and so future
studies must be interpreted in the context
of these new data. Third, the new reports
reinforce the importance of rigorous genetic 
background control when interpreting the 
effects of single gene mutations. Studies in 
Drosophila suggest that 8-10 generations of 
backcrossing are required before the genetic 
background can be considered reasonably 
well controlled; this needs to be the standard 
in worms too (Fig. 1). For genetically engi- 
neered mutant mice, where this approach 
may be impractical, analysis of independ- 
ent transgenic animal lines and/or reversal 
of traits upon re-expression of the gene of 
interest should be undertaken to ensure that 
that gene is indeed responsible for the effect 
under investigation. 
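The halving described in the Figure 1 caption can be put into rough numbers. The short Python sketch below is an illustration only, not the analysis used by Burnett et al.; the function names and the 1% recombination fraction are assumptions chosen for the example. It treats unlinked differences as being lost at 50% per backcross generation, and a linked variant as surviving each generation with probability one minus its recombination fraction from the gene of interest.

```python
def unlinked_fraction(generations: int) -> float:
    """Expected fraction of unlinked background differences still present
    after the given number of backcross generations (halved each time)."""
    return 0.5 ** generations


def linked_retention(generations: int, recomb_fraction: float) -> float:
    """Probability that a variant at the given recombination fraction from
    the gene of interest has not yet been separated from it."""
    return (1.0 - recomb_fraction) ** generations


if __name__ == "__main__":
    # Hypothetical linked variant: 1% recombination per generation.
    for g in (0, 1, 5, 10):
        print(g, unlinked_fraction(g), linked_retention(g, 0.01))
```

Under these assumptions, about 0.1% of unlinked differences remain after ten generations, whereas a variant 1% recombination away is still present about 90% of the time, which is why closely linked differences are the last to go.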

Perhaps the most interesting issue is the 
apparent disconnect between these new 
results and the powerful effects of sirtuins on 
age-associated disease in mammals. Maybe 
sirtuins have their strongest effects on specific 
aspects of physiological homeostasis and stress 
responses, rather than in modulating ageing 
per se. For example, consistent with published
data, Burnett et al. show that Sir2 overexpression
makes worms resistant to toxic protein
aggregates. Also, increased expression of
the related SIRT1 protein in mice suppresses
metabolic dysfunction and the development
of certain types of cancer, without increasing
overall lifespan. Although roles for other
mammalian sirtuins in promoting longevity
have not been explored, such studies are
under way in several laboratories. So the latest
reports are unlikely to be the final word on
sirtuins and ageing.


David B. Lombard is in the Department 
of Pathology and Institute of Gerontology, 
and Scott D. Pletcher is in the 
Department of Molecular and Integrative 
Physiology and Institute of Gerontology, 
University of Michigan, Ann Arbor, 
Michigan 48109, USA. 

e-mails: davidlom@umich.edu; 
spletch@umich.edu 


Don’t write 
sirtuins off 


CARLES CANTO & JOHAN AUWERX 


The role of Sir2 and its mammalian counterpart
SIRT1 as lifespan regulators, proposed
on the basis of work done exclusively in
simple organisms, has been grounds for intensive
discussion. Studies on yeast lifespan were
the first to cast doubt on the role of sirtuins
in longevity. And Burnett and colleagues’
elegant work puts a final nail in the coffin.
Let’s not forget, however, that an overwhelm- 
ing body of evidence indicates that sirtuins 
have crucial roles in metabolic homeostasis. 

The new paper, indeed, dampens the lon- 
gevity claims assigned to Sir2. But one caveat of 
most of the genetic work in simple organisms 
is that it is rarely accompanied by biochemi- 
cal studies that reveal the extent to which the 
manipulations of a gene or protein are trans- 
lated into altered activity. Sirtuins are pro- 
tein deacetylase enzymes, and so in this case 
it remains unclear how manipulation of the 
gene encoding Sir2 changes the acetylation 
state of its main target proteins. Such infor- 
mation would establish beyond doubt that the 
absence of the Sir2 gene is not compensated for 
by increased activity of another member of the 
sirtuin family. 

As for the relevance of these data to the
role of sirtuins in mammals, it was previously
shown that mice overexpressing SIRT1 do
not live longer. Initially, this lack of effect on
lifespan was attributed either to insufficient
expression of the SIRT1 transgene or to possible
confounding functional compensation
of protein deacetylation by other Sir2-related
proteins. However, if Burnett and colleagues’
results can be extended to mice, then the earlier
mouse data are not so astonishing after all.
The main role of SIRT1 in mammals might not
be directly related to lifespan regulation.

If not to control lifespan, what is the main
function of SIRT1? The answer may lie in the
observation that the activity of SIRT1 depends
strictly on the levels of the coenzyme NAD+,
which acts as a co-substrate for the deacetylation
reaction that SIRT1 catalyses. Changes in
NAD+ levels, which reflect the metabolic activity
of the cell, hence modulate SIRT1 activity
according to the cellular energy status. This
NAD+ dependence makes SIRT1 an attractive
integrative node for metabolic sensing
and for transcriptional regulation, as SIRT1
influences post-translational modification of
transcription factors and histone proteins.

In contrast to the observations on lifespan, 
the effects of SIRT1-related proteins favouring
metabolic flexibility are based on a wealth
of genetic, physiological and pharmacologi- 
cal evidence, and are conserved from yeast 
to mammals. In mammals, SIRT1 mediates 
Figure 1 | Eliminating effects of confounding genetic sequences. Burnett et al. show that genetic
backcrossing is essential to ensure that comparisons between treatment and control strains of an organism 
are not confounded by differences unrelated to the gene of interest (pink). Confounding effects may be due 
to a large effect of a nearby gene (orange) or small effects of many genes throughout the genome (occurring 
within the purple elements). At generation 0 (G0, no backcrossing), these influences predominate. With
backcrossing, processes such as independent chromosome segregation and meiotic recombination 
introduce variants from the control strain (blue elements) into the treatment strain, reducing the possibility 
of confounding effects. Although most differences between the two strains are rapidly lost (at a rate of 
roughly 50% per generation), closely linked differences often persist for many generations. It is only after 
recombination occurs sufficiently closely to the gene of interest (at G10 or so) that genetic backgrounds can 
be considered properly controlled and that differences can be attributed to the gene of interest. 


metabolic and transcriptional adaptations 
to situations of energy stress and nutrient 
deprivation by enhancing respiration by the 
mitochondria, the cell’s energy producers. 
Mice overexpressing SIRT1 are therefore protected
from the metabolic damage caused by
a high-fat diet.

Similarly, indirect activation of SIRT1
by the compound resveratrol protects
against metabolic and age-related diseases,
curbing the lifespan reduction induced by
high-calorie diets, even though it has no
effect on lifespan in mice fed regular chow.
Conversely, outbred mice lacking SIRT1
show deficiencies in metabolism and cannot
respond properly to the lifespan-increasing
effects of calorie restriction. This highlights
how the metabolic adaptations that SIRT1 
induces might indirectly influence mammalian 
lifespan. 

Indeed, although in light of Burnett and
colleagues’ findings the appeal of sirtuins
as a sensu stricto lifespan determinant might
be gone, SIRT1 activation remains a promising
approach to delaying general age-related
physiological decline and to treating numerous 


inherited and acquired diseases character- 
ized by defective mitochondrial function. 
The astonishing ability of SIRT1 to enhance 
‘healthspan’ by promoting metabolic fitness 
will also guarantee it a long life as a subject for 
further exciting research.


Carles Canto is at the Nestlé Institute
of Health Sciences, CH-1015 Lausanne,
Switzerland. Johan Auwerx is in the Ecole
Polytechnique Fédérale de Lausanne, 
CH-1015 Lausanne, Switzerland. 

e-mail: johan.auwerx@epfl.ch 
Slippery when wetted


The slick interior of the pitcher plant has inspired a slippery material possessing 
self-lubricating, self-cleaning and self-healing properties. The secret is to infuse
a porous material with a liquid that repels oils and water. SEE LETTER P.443 


MICHAEL NOSONOVSKY 


The legendary water repellence of lotus
leaves has inspired a field of research
aimed at making similarly ‘superhydrophobic’
surfaces. But it’s much more
difficult to make omniphobic materials,
which repel oils as well as water. On page 443
of this issue, Wong et al. report a radical new
approach to making omniphobic surfaces that 
was inspired by another member of the plant 
family: the insect-eating pitcher plant. 

An ancient Indian poem, the Bhagavad Gita, 
has this to say about a seeker of truth: “Having 
abandoned attachment, he acts untainted by 
evil, just as a lotus leaf is not wetted.” Here, as 
in many cultures, the lotus is used as a symbol 
of purity because of its ability to emerge clean 
from muddy water. Examination of lotus leaves 
in the 1990s using scanning probe microscopy
revealed that this ability is a result of the leaves’ 
surface microstructure: each leaf is covered 
with tiny bumps called papillae. When the surface
is wetted by water, a so-called composite
solid-air-liquid interface forms in which water
sits atop pockets of air trapped between the
papillae (Fig. 1a). This drastically reduces the
solid-water contact area, so that water droplets
form an almost perfect sphere and easily
roll on the surface, washing away dust in the 
process. This superhydrophobic behaviour is 
often referred to as the lotus effect. 


Different models that connect surface roughness
with water wetting have been proposed,
including the Cassie-Baxter model (which
describes a three-phase solid-air-liquid
composite interface; Fig. 1a), and the Wenzel
model (which invokes a simpler two-phase
system in which no air pockets are trapped
between the solid and the water; Fig. 1b). The
Cassie-Baxter model explains superhydrophobicity,
and has helped in the development
of techniques for structuring the surfaces of 
different materials to mimic the lotus effect. 
These materials find applications in the field 
of tribology — the study of friction, wear 
and lubrication — and in other areas of 
engineering. 
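The two models named above have standard textbook forms, which the following Python sketch evaluates. It is a minimal illustration, not taken from Wong et al.: the function names, the 110° intrinsic contact angle and the roughness and solid-fraction values are assumptions. The Wenzel relation is cos θ* = r cos θ, where r is the ratio of true to projected surface area; the Cassie-Baxter relation for a droplet resting on solid and trapped air is cos θ* = f (cos θ + 1) − 1, where f is the fraction of the droplet base touching solid.

```python
import math


def wenzel_angle(theta_deg: float, roughness: float) -> float:
    """Apparent contact angle (degrees) for a fully wetted rough surface.
    roughness = true area / projected area (>= 1)."""
    c = roughness * math.cos(math.radians(theta_deg))
    c = max(-1.0, min(1.0, c))  # clamp: the relation saturates at 0 or 180 degrees
    return math.degrees(math.acos(c))


def cassie_baxter_angle(theta_deg: float, solid_fraction: float) -> float:
    """Apparent contact angle (degrees) for a droplet sitting on a
    solid/air composite; solid_fraction is between 0 and 1."""
    c = solid_fraction * (math.cos(math.radians(theta_deg)) + 1.0) - 1.0
    c = max(-1.0, min(1.0, c))  # guard against floating-point overshoot
    return math.degrees(math.acos(c))


if __name__ == "__main__":
    theta = 110.0  # assumed intrinsic contact angle of a flat hydrophobic surface
    print("Wenzel, r = 2:", wenzel_angle(theta, 2.0))
    print("Cassie-Baxter, f = 0.1:", cassie_baxter_angle(theta, 0.1))
```

With these assumed values, reducing the solid fraction to 10% pushes the apparent contact angle to roughly 159°, which is the regime usually called superhydrophobic; roughness alone in the Wenzel picture gives a smaller increase.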

Making superhydrophobic surfaces is a 
challenge, but it is far more difficult to pro- 
duce oleophobic surfaces that repel organic 
liquids such as oils. This is because oil mol- 
ecules are nonpolar, and have a much lower 
surface energy than polar water molecules; it 
is therefore not energetically favourable for oil 
droplets to form as spheres on a solid surface. 
Omniphobic surfaces that are both oleo- 
phobic and hydrophobic are highly desirable 
for many applications — they could, for exam- 
ple, be used to prevent dirt from collecting on 
optical devices or to prevent moving parts in 
micrometre-scale devices from sticking to each 
other. Previous approaches for making oleo- 
phobic and omniphobic surfaces have involved 




Figure 1 | Forms of surface wetting. a, The Cassie-Baxter model of surface wetting proposes that water
droplets sitting on rough surfaces form a solid-air-water interface. Air pockets trapped beneath the
droplet reduce the contact between the water and the surface. If the surface features are of the right size
and are regular, then the surface becomes water repellent. b, In the Wenzel model, no air pockets form,
and so the surface is completely wetted by the droplet. c, Wong et al. have made porous materials that
are infused with, and wetted by, a liquid lubricant that repels water and oils. The surface features do not 
themselves generate water repellence, but serve to hold the lubricant in place. 
the design of complicated surface geometries
to prevent oil from penetrating into valleys
between bumps.

Wong et al. suggest a new approach,
inspired by the surface of the insect-eating 
pitcher plant Nepenthes. The plant captures 
prey using a slippery, water-lubricated surface:
insects that step on the surface at the rim of
the pitcher slide down into digestive juices at
the bottom. The surface is slippery because
the lubricant forms a continuous film that 
repels oils on the insects’ feet. Although the 
surface has microstructures, these are irregular 
(unlike those of lotus-inspired surfaces), and 
serve only to hold the lubricant in place. 

The authors mimicked pitcher-plant sur- 
faces by making a sponge-like material and fill- 
ing it with a lubricating liquid to create slippery 
liquid-infused porous surfaces (SLIPS). When 
a droplet of another liquid is placed on the 
material, a composite solid-lubricant-liquid
interface is formed (Fig. 1c). The lubricant 
has a similar function to the air pockets in 
the lotus effect, but it also forms a continuous 
film, similar to that on the surfaces of pitcher 
plants. Unlike lotus-mimicking materials, 
SLIPS can be oleophobic, and the presence of 
a lubricant means that friction at the surfaces 
is very low. In fact, by choosing lubricants that 
are immiscible with both water and oils, the 
authors prepared SLIPS that have highly prom- 
ising omniphobic properties. What’s more, the 
SLIPS could withstand high pressures, were 
wear-resistant, and even healed themselves in 
the case of minor damage — all of which are 
advantages over lotus-mimicking materials. 

The development of SLIPS typifies two 
themes that are likely to dominate the field of 
biomimetic and functional surfaces in coming 
years. The first is the integration of self- 
healing, self-lubricating and self-cleaning 
capabilities into surfaces. Friction and wear are 
usually viewed as causes of energy dissipation 
and material deterioration, but under certain 
circumstances they can lead to increased order 
at interfaces. This can form the basis of the
capabilities mentioned above. Indeed, this 
is what happens with the SLIPS — when the 
porous material in SLIPS is damaged by wear 
or impact, a combination of effects (chemical 
potential, concentration and pressure gradi- 
ents) facilitates the lubricant’s transport to the 
surface, restoring the material’s self-lubricating
and self-cleaning properties. 

The second theme is the idea that the wetting 
of rough surfaces can be more complex than 
is predicted by either the two-phase Wenzel 
model or the three-phase Cassie-Baxter
model. Indeed, multi-phase interfaces involv- 
ing a variety of components — solids, oils, 
water, lubricants, air and so on — have been 
identified, and show great promise for new 
applications such as underwater oleophobicity. 

Not only are Wong and colleagues’ SLIPS of 
fundamental interest, but they will probably 
also lead to the development of new materials 


for many applications — in biomedical 
devices, for example, or as coatings to prevent 
the icing or fouling of surfaces. Currently, the 
main weakness of SLIPS is their durability, 
which is limited by how long the lubricant stays 
in the pores without evaporating or leaking. 
Another problem is that there are strict limita- 
tions on the chemical properties of the lubri- 
cants: they must be immiscible with both water 
and oil, but they should also penetrate into the 
pores of the underlying material. The authors’ 
preliminary studies into these issues are 
encouraging, but additional research is needed 
before applications will emerge.
A yeast for all reasons


Scientists have begun to overhaul a yeast’s genome to make it more stable, 
engineerable and evolvable. Remarkably, the part-natural, part-synthetic 
yeast cells function and reproduce without obvious ill effects. SEE LETTER P.471 


PETER J. ENYEART & ANDREW D. ELLINGTON 


The baker’s yeast Saccharomyces cerevisiae
is a model organism, and therefore 
one of the best-understood biological 
systems on the planet. Nevertheless, the 
Byzantine complexity of its inner workings still 
keeps bioengineers up at night, and continues 
to provide fodder for experimentation. If scien- 
tists could ‘refactor’ model organisms — that is, 
recode their genomes to be simpler and more 
amenable to human understanding and tinker- 
ing — then science and biotechnology based 
on those organisms could proceed at an accel- 
erated pace. On page 471 of this issue, Dymond 
et al.¹ present a major advance towards this
end: the construction of a functional, partly 
synthetic version of the S. cerevisiae genome. 

Rewriting genomes to meet the specifications
of humans has been a stated goal of synthetic
biology for some time². But the labour and
expense involved in actually making such
radical changes to genomes, coupled with
uncertainties about the chances of improving
on what nature spent billions of years perfecting,
have meant that only a few serious tilts
at genome re-engineering have been made.
Previous noteworthy examples include the
refactoring of 12,000 base pairs (about 30%)
of a virus genome³ and the removal of ‘amber’
stop codons (nucleotide sequences that
signal the termination of translation) from
the bacterium Escherichia coli⁴, a feat that
should allow researchers to rewrite portions
of the bacterium’s genetic code at will.
And, of course, the genome of a Mycoplasma
bacterium has been synthesized de novo by
workers at the J. Craig Venter Institute in
Rockville, Maryland, and used to infuse a
working cell⁵.

Dymond et al.¹ have now raised the bar by
starting work on eukaryotes (organisms such 
as fungi, plants and animals), which have 
much larger and more complex genomes 
than bacteria. More specifically, the authors 
have replaced sections of two chromosomes 
of S. cerevisiae — 90,000 bases at the end of 
chromosome IX, and 30,000 bases at the end 
of chromosome VI — with synthetic DNA. 
Their eventual goal is presumably to replace 
the entire genome of 12 million base pairs with 
a human-designed sequence. 

To make their synthetic DNA, Dymond 
et al. entirely removed 20 regions from the 
naturally occurring yeast chromosomes. Most 
of these regions were repetitive sequences 
(which can cause DNA segments to be deleted 
or even cause chromosomes to mis-segregate), 
or sequences that were non-functional or 
redundant. The authors also recoded all genes 
longer than 500 bases to contain ‘watermarks’:
sequences that allow the synthetic DNA to
be easily differentiated from natural sequences
using standard laboratory methods, but that do
not change the sequences of proteins encoded
by genes. As in the previously reported work⁴
in E. coli, Dymond and colleagues¹ modified
amber stop codons in the DNA of S. cerevisiae
so that they could be recoded in the future, for 
example to encode unnatural amino acids for 
insertion into yeast proteins. 

Astoundingly, the authors found that yeast 
cells containing the modified genome suffered 
no growth defects and displayed minimal dif- 
ferences in gene expression in comparison 
with the wild-type strain. The entire sequence 
of the artificially added DNA was faithfully 
reproduced by living cells, which is either a tes- 
tament to the robustness of human engineer- 
ing or a sign that God's fingerprints are fainter 
than creationists would have you believe. 

In addition to the changes mentioned earlier, 
Dymond and co-workers introduced sequence 
elements known as loxPsym sites after every 
non-essential gene in their synthetic DNA, 
and at several other positions. In the presence 
of an enzyme called Cre recombinase, these 
loxPsym sites stochastically recombined with 
each other, either deleting or inverting the 
intervening sequence of DNA. The authors 
were thus able to generate a vast library of 
yeast genomes, containing all manner of 
random architectures, at will. Such libraries 
could be screened or evolved to find new yeast 
strains that are better suited to living in a given 
environment. Moreover, because yeast is used 
to produce alcohol, proteins and high-value 
organic compounds, new strains generated in 
this way might prove to be useful for industry, 
in the same way that a simplified E. coli strain 
has proved to be an excellent platform for 
producing large quantities of proteins⁶.
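
The scrambling described above can be pictured with a toy simulation. This is only a sketch under simplifying assumptions that are not taken from the paper: the chromosome is a list of named segments, every boundary between two segments stands for one loxPsym site, and each Cre-mediated event deletes or inverts the intervening DNA with equal probability. The function name `scramble` and the segment labels are hypothetical.

```python
import random


def scramble(genome, n_events, rng):
    """Apply up to n_events random recombination events to a list of segments.

    Each segment is a (name, orientation) pair, orientation being +1 or -1.
    Site k is the boundary between genome[k - 1] and genome[k].
    """
    genome = list(genome)  # work on a copy; the input is left untouched
    for _ in range(n_events):
        if len(genome) < 3:
            break  # fewer than two sites remain, so nothing can recombine
        # Pick two distinct sites; the DNA between them is genome[i:j].
        i, j = sorted(rng.sample(range(1, len(genome)), 2))
        if rng.random() < 0.5:
            # Deletion: the intervening segments are lost.
            del genome[i:j]
        else:
            # Inversion: segment order and orientation are both flipped.
            genome[i:j] = [(name, -o) for name, o in reversed(genome[i:j])]
    return genome


if __name__ == "__main__":
    start = [(f"gene{k}", 1) for k in range(10)]
    # Different seeds give different architectures, i.e. a small 'library'.
    for seed in range(3):
        print(scramble(start, 4, random.Random(seed)))
```

Running the loop with many seeds yields a collection of distinct architectures from one starting sequence, which is the property that makes such libraries useful for screening. The sketch ignores the fact that losing an essential gene would be lethal.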

The obvious extension of Dymond and 
colleagues’ work is to rebuild the entire yeast 
genome. However, given that the currently 
completed synthetic sequences represent only 
about 1% of the whole genome, rebuilding 
the remainder is a daunting task. One issue is 
that, even though the aggregated cost of the 
materials, apparatus and consumables used in 
DNA synthesis has been steadily decreasing, 
the construction of entire genomes remains 
inordinately expensive⁷.

Even more problematic is the cost of labour. 
A comparison of recent endeavours in genome 
synthesis and modification (Table 1) reveals 


TABLE 1 | LABOUR REQUIRED FOR GENOME SYNTHESIS

Organism (year)                     DNA bases synthesized (% of genome)    DNA bases per year of labour*
T7 bacteriophage (2005)³            12,000 (30)                            1,300
Mycoplasma mycoides (2010)⁵         1,080,000 (100)                        15,000
Escherichia coli (2011)⁴            4,600,000 (100) or 28,000 (0.6)†       96,000 or 600†
Saccharomyces cerevisiae (2011)¹    120,000 (1)                            2,700

*Estimated as follows: total bases synthesized/[number of authors × 3], assuming that 3 years is the average time spent by a person on genome-synthesis projects.
†Depending on whether the entire genome or only synthetic oligonucleotides are counted.
that DNA synthesis at the scale of the yeast
genome will require either armies of scientists
(such as the wonderful group of undergraduate
students currently working on similar
projects⁸ with Dymond and co-workers) or
new methodologies. The authors’ landmark
work¹ confirms that automated DNA synthesis
and assembly techniques are becoming necessary,
and that the total synthesis of genomes
is likely to supersede piecemeal approaches to
genome modification. Given a little push here
and there from technological advances, the age
of designer genomes is nigh.
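
The labour argument can be checked with a few lines of arithmetic. The sketch below applies the formula from the footnote to Table 1 and then asks how many person-years the full 12-million-base yeast genome would take at the yeast project's tabulated rate. The head count of 15 is an assumption chosen only because it reproduces the rounded table value; it is not stated in the text, and the function names are illustrative.

```python
def bases_per_person_year(total_bases, n_authors, years_per_person=3):
    """Table 1 footnote: total bases synthesized / (number of authors x 3)."""
    return total_bases / (n_authors * years_per_person)


def person_years_needed(target_bases, rate_bases_per_person_year):
    """Person-years required to synthesize target_bases at a constant rate."""
    return target_bases / rate_bases_per_person_year


if __name__ == "__main__":
    # Hypothetical head count (15); 120,000 bases is the figure from Table 1.
    rate = bases_per_person_year(120_000, n_authors=15)
    print(round(rate, -2))  # 2700.0, matching the rounded table entry

    # Whole yeast genome (12 million base pairs) at the tabulated 2,700 rate.
    print(round(person_years_needed(12_000_000, 2_700)))  # 4444
```

At roughly 4,400 person-years, the estimate makes concrete why either very large teams or new, automated methods would be needed.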

Single electrons take the bus


Single-electron circuitry is a promising route for quantum information 
processing. The demonstration of single-electron transfer between two distant 
quantum dots brings this technology a step closer. SEE LETTERS P.435 & P.439


TAKIS KONTOS 


The realization of electronic machines
that exploit the laws of quantum
mechanics is a dream for many physicists.
Among the different architectures
proposed so far for meeting this goal, one very 
promising option is based on quantum dots: 
nanometre-sized electron boxes, or conducting
islands, that can contain as few as one
electron. As with classical electronic devices,
the construction of such a quantum machine 
requires ‘wires’ to connect up the elements of 
the machine's internal electronic circuitry. But 
in the quantum world, making such wires is 
not a trivial matter. 

Two papers in this issue, one by Hermelin
et al.¹ (page 435) and the other by McNeil et al.²
(page 439), demonstrate wires, or ‘buses’,
that can carry only a single electron and interconnect
two distant quantum dots. These
findings provide a building block for the
implementation of large-scale networks of
quantum dots, which will be necessary to
scale up techniques for local quantum manipulation
that are currently performed only at
the single-quantum-dot level³.

In quantum dots, confinement can be such
that the characteristic charging energy of the
dot (the energy it takes to add an extra electron
to it) exceeds the energy of thermal fluctuations at
cryogenic temperatures. In such a situation,
known as a Coulomb blockade, electrons passing
through the quantum dot have to do so one
by one. This fact, combined with the discreteness
of the quantum dot’s energy spectrum,
makes the dots ideal sources of single electrons’. 
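
To get a feel for the condition described above, the following sketch compares a dot's charging energy with the thermal energy. It uses the common convention E_C = e²/C (some authors use e²/2C), and the capacitance of 100 attofarads, the temperatures and the safety margin of 10 are illustrative assumptions rather than values from the two papers.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)
K_B = 1.380649e-23          # Boltzmann constant in joules per kelvin (exact SI value)


def charging_energy(capacitance_farads):
    """Energy needed to add one electron to the dot, taken as e^2 / C."""
    return E_CHARGE ** 2 / capacitance_farads


def thermal_energy(temperature_kelvin):
    """Characteristic energy of thermal fluctuations, k_B * T."""
    return K_B * temperature_kelvin


def in_coulomb_blockade(capacitance_farads, temperature_kelvin, margin=10.0):
    """True if the charging energy exceeds k_B * T by the chosen margin."""
    return charging_energy(capacitance_farads) > margin * thermal_energy(temperature_kelvin)


if __name__ == "__main__":
    c_dot = 1e-16  # hypothetical dot capacitance: 100 aF
    for temperature in (0.1, 300.0):  # cryogenic versus room temperature
        ratio = charging_energy(c_dot) / thermal_energy(temperature)
        print(temperature, round(ratio, 2), in_coulomb_blockade(c_dot, temperature))
```

With these assumed numbers the charging energy exceeds the thermal energy by more than two orders of magnitude at 0.1 kelvin but falls below it at room temperature, which is why such experiments are run at cryogenic temperatures.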

The usual way to extract a single electron 
from a quantum dot is to raise the last occu- 
pied energy level of the dot to well above the 
characteristic energy (the Fermi level) of the 
electronic reservoir to which the dot is coupled. 
This can be done with the help of an electro- 
static ‘gate’ electrode. In this manner, the 
electron ‘sitting’ on the last occupied energy 
level is forced energetically to ‘fall off’ into the 
electronic reservoir; conversely, an electron can 
be absorbed from the electronic reservoir by 
lowering a previously unoccupied energy level 
below the Fermi level. Because an electron 
emitted in such a way rapidly mixes with other 
electrons in the electronic reservoir, knowledge 
of that electron’s initial electronic state will be
deficient, and any quantum information stored 
in the electron will be lost. This explains why 
extracting an electron from a dot and capturing 
it in another one is far from trivial. 

To isolate a single electron and implement
a single-electron bus, Hermelin et al.¹ and
McNeil et al.² took an alternative approach that
involves moving a quantum dot rather than
acting directly on its energy levels. The basic
idea is to distort the electrostatic potential that
traps the dot’s electron, using an acoustic wave
that propagates across the surface of the device
hosting the dot. The acoustic wave, which is
induced by a microwave pulse, allowed the
authors to expel a single electron from the dot
and, subsequently, to transfer it to a receiving
dot through an empty channel, in which the
electron ‘surfs’ on the acoustic wave.

Hermelin et al. and McNeil et al. demonstrated
successful single-electron transfer
between the dots by detecting coincident
emission and capture at the two dots. These events were
detected with devices that are routinely used
for charge detection in quantum dots’: sensitive
electrometers that are placed close to the
dots. The efficiency of the authors’ approach¹,²
was such that it allowed, for example, McNeil
et al.² to reliably transfer single electrons back
and forth between the dots over a cumulative
distance of about 0.25 millimetres, nearly a
macroscopic distance. Hermelin et al. went on
to show that, after initially loading a dot with 
two electrons, it is possible to split them apart: 
one stays in the dot and the other is captured 
by a receiving dot. 

These experiments¹,² are particularly relevant
with a view to using single-electron
buses for retrieving and distributing quan- 
tum information stored in quantum dots 
that are embedded in complex networks. It 
has been shown’® that electronic spin can 
be manipulated quantum mechanically 
with ever-increasing fidelity. It is therefore 
possible to imagine manipulating an informa- 
tion-encoding single spin in one quantum dot, 
then transporting it to another distant dot in 
the network. What’s more, by using a double 
quantum dot, one could foresee the creation 
of an arbitrary two-electron superposition spin
state’ and its transfer between distant quan- 
tum dots. This would pave the way for studying 
quantum entanglement of two electrons in a 
solid-state environment. 

All of these exciting possibilities offered 
by the set-ups of Hermelin et al. and McNeil 
et al. require that single-electron transfers do 
not degrade quantum information, an aspect 
that is not addressed in their work. Because 
new electron-manipulation techniques always 
come with unexpected dissipation mecha- 
nisms, it is not clear whether the electrons 
can retain their spin state, and so the encoded 
information, during their travels in the 
channel. However, recent advances in spin 
manipulation and control”® call for optimism. 
We should therefore be confident that the 
demonstrated single-electron bus will go 
quantum in the not-so-distant future.

Endless variation most beautiful


The genetic basis of traits can be understood by comparing the DNA of varieties 
of the same species. The genomes of many varieties of a model plant organism 
have now been sequenced, and the results are revelatory. SEE ARTICLE P.419 


MICHAEL BEVAN 


Charles Darwin wrote¹ of the “endless
forms most beautiful” of species that
have arisen from natural selection. But
his words also apply to the genetic variation
within species such as the highly adaptable
plant Arabidopsis thaliana (Fig. 1). The first
analyses of the sequences of multiple genomes 
of A. thaliana”, including one on page 419 of 
this issue by Gan et al.*, have now been pub- 
lished. These studies provide a foundation 
for identifying the factors that shape genome 
change, and for mapping genome-sequence 
variation among a wide range of A. thaliana 
varieties that represents the plant's diversity.
They should also facilitate the association of
phenotypes (the observable characteristics
of an organism) with genotypes (inherited
genetic information) — most importantly in 
crop plants. 

The genome sequences of most organisms 
are represented by examples taken from a 
single individual of each species, the choice 
of which has often been haphazardly forced 
on researchers by technological and financial 
limitations. New sequencing technologies, such 
as the Illumina methods* used in the latest stud- 
ies’ *, have removed the need for such arbitrary 
choices and provided exciting opportunities 
to explore sequence diversity within species. 
There is a huge range of organisms that can be 
studied, but it can be argued that two classes 
of genome deserve high priority for diversity 
studies: human genomes, because of the endur- 
ing interest in our origins and diseases; and 
plant genomes, because of humans’ complete 
dependence on them for food. 

Arabidopsis has an evanescent and oppor- 
tunistic life cycle, much to the chagrin of gar- 
deners. The plant colonized extensive regions 
of Eurasia and North Africa after the most 
recent ice age, spreading from refuges in the 
Iberian Peninsula and central Asia®. Sub- 
sequent European colonization then allowed 
it to spread worldwide. Arabidopsis is a widely 
used experimental plant because it is compact 
and easily grown, has a rapid life cycle and 
self-pollinates. Its compact genome was
the first plant genome to be sequenced’, pro- 
viding an accurate foundation for the current 
studies” *. 


Figure 1 | Small but successful. The unassuming 
plant Arabidopsis thaliana has colonized the world. 
The genome sequences” * of many varieties of 

A. thaliana should reveal the genetic basis of its 
adaptive traits. 


Preliminary analyses* of Arabidopsis 
genomes showed that plants from different 
geographical locations exhibit many com- 
monly held genetic variations, consistent with 
the comparatively recent spread of the plant 
from a few locations and with frequent mix- 
ing of populations. The preliminary work also 
revealed that genome-wide association studies 
(GWAS) hold exceptional promise for identi- 
fying sequence variations that affect a wide 
range of plant phenotypes, many of which 
could be useful in agricultural crops. 

As they report in Nature Genetics, Cao et al.’ 
have now sequenced the genomes of 80 strains 
of Arabidopsis that represent the genetic diver- 
sity of the plant across its extensive geographi- 
cal range. By sequencing short pieces (or 
reads) of DNA and mapping them to a refer- 
ence Arabidopsis genome, the authors identi- 
fied single nucleotide polymorphisms (SNPs) 
— sequence variations between strains that 
involve single nucleotides. DNA sections that 
couldn't be mapped to the reference genome 
in this way were assembled de novo and then 
checked to see whether they could be anchored 
to the reference through the alignment of base 
pairs. This allowed more-extensive sequence 
variation to be identified. 
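
The core of the SNP-identification step can be illustrated with a toy comparison. The sketch assumes that the short reads have already been mapped and collapsed into one consensus sequence per strain, aligned position by position to the reference; real pipelines work with read-level evidence and quality scores. The function name `find_snps` and the sequences are hypothetical.

```python
def find_snps(reference, strain):
    """Return (position, reference_base, strain_base) for each single-base difference.

    Both sequences must already be aligned and of equal length.
    Positions are 1-based. Gaps ('-') and uncalled bases ('N') in the
    strain are skipped rather than reported as SNPs.
    """
    if len(reference) != len(strain):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for position, (ref_base, strain_base) in enumerate(zip(reference, strain), start=1):
        if strain_base in "-N":
            continue  # no reliable base call at this position
        if ref_base != strain_base:
            snps.append((position, ref_base, strain_base))
    return snps


if __name__ == "__main__":
    reference = "ACGTACGT"  # toy reference sequence
    strain = "ACGAACGT"     # toy strain consensus with one substitution
    print(find_snps(reference, strain))  # [(4, 'T', 'A')]
```

Regions that cannot be placed against the reference at all fall outside what this kind of comparison can see, which is why the unmapped sections were assembled de novo.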

In a related study published in the
Proceedings of the National Academy of Sciences,
Schneeberger et al.’ sequenced four varieties 
of Arabidopsis using a ‘sub-assembly’ approach. 
This involved clustering short reads into 
groups that correspond to certain regions of a 
reference genome’, assembling the reads into 
continuous sections (blocks), and then assem- 
bling the blocks into larger and larger sections 
until the whole genome was constructed. Gan 
et al.* also used a sub-assembly approach to 
sequence 18 Arabidopsis varieties. 

The advantage of sub-assembly approaches 
is that, as far as possible, different genomes 
are assembled independently. This changes 
the focus of comparative genomics: instead 
of comparing one or many genomes with a 
single reference, many genomes are compared 
with each other. Sub-assembly approaches 
also capture a spectrum of sequence variation 
broader than changes involving just a few 
nucleotides, thereby allowing a fundamental 
re-evaluation of sequence variation within a 
species. Indeed, this is one goal of the ambi- 
tious 1001 Genomes Project’, of which the 
three latest papers” * are part. The project has 
already sequenced 471 Arabidopsis varieties, 
and has a further 706 in its pipeline. 

The new studies’ identified extensive 
genome-sequence changes between varieties. 
For example, SNPs and copy-number variants 
(differences in the number of duplications of 
one or more sections in a genome) are fre- 
quent. Another finding is that a significant 
proportion of Arabidopsis variation involves 
chemical changes to methylated cytosine 
bases. Cytosines are often methylated dur- 
ing epigenetic modifications to DNA, which 
alter gene expression without affecting DNA 
sequence, and the latest data”* suggest that 
such epigenetic changes have the potential to 
cause mutations. Furthermore, large numbers 
of genes in different Arabidopsis varieties con- 
tain premature stop codons (short sequences 
that signal the termination of translation), 
which would probably adversely affect the 
functions of proteins encoded by those genes. 
Interestingly, many of these genes also con- 
tain compensatory changes that would be 
expected to restore protein function. The most 
dramatic sequence variations were detected 
mainly in the studies that used sub-assembly 
approaches”, indicating that these approaches 
should be adopted in the future to maximize 
detection of a wide range of genomic variation 
in Arabidopsis and in crop plants. 
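
As a rough illustration of what a premature stop codon means in practice, the short Python sketch below scans a coding sequence codon by codon and reports the first in-frame stop that occurs before the final codon. The function name and the toy sequences are hypothetical and are not taken from the studies discussed here; the sketch also assumes the sequence is already in frame and ends with its normal stop codon.

```python
# Minimal sketch (hypothetical helper, toy data): find a premature stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the final codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3
    # The last codon is skipped because it is the normal termination codon.
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i
    return None


# Toy example: a single G-to-A change turns TGG (Trp) into TGA (stop).
reference = "ATGGCTTGGAAATAA"  # ATG GCT TGG AAA TAA
variant = "ATGGCTTGAAAATAA"    # ATG GCT TGA AAA TAA

print(first_premature_stop(reference))
print(first_premature_stop(variant))
```

A check of this kind only flags the disruption; it says nothing about compensatory changes elsewhere in the gene, which is why re-annotation of each genome matters.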

The greatest variability within Arabidop- 
sis was found in genes involved in defence 
and responses to the environment”; the 
same is true of sequence variation between 
species and between taxonomic families. 
Gan et al.* assessed differences in gene 
expression between 18 naturally occurring 
varieties, and found that extensive genetic vari- 
ation was concentrated within a short section 
(100 base pairs) that controls the expression 
of an adjacent gene. This accounted for much 
of the variation in expression between strains. 
The differences in expression affected other 
genes, mainly those involved in responses 
to pathogens and those encoding a family of 
transcription factors that control flowering. 

The broad spectrum of genomic change 
identified in the three landmark studies” * can 
be used to associate phenotypes — including 
‘quantitative’ phenotypes that underlie complex 
traits — with sequence variation. With this in 
mind, Gan et al.’ sequenced the 18 diverse 
varieties of Arabidopsis that have been inter- 
crossed to create a structured population” used 
for mapping complex traits to DNA sequence 
variation. By contrast, Cao and co-workers’ 
genomic data’, along with other data from the 
Arabidopsis 1001 Genomes Project, can be used 
to relate phenotypic variation to the underlying 
genotypic variation observed in GWAS. 

Comparative genome studies have already 
provided many useful results. A pioneer- 
ing study* that examined 107 phenotypes of 
Arabidopsis in 96 diverse, genotyped lines 
found associations between several adaptive 
phenotypes and sequence variations. Similar 
approaches have been used in a study of 517 
local varieties of rice to identify genetic vari- 
ation associated with 14 useful agronomic 
traits''. Another genome-wide study has 
shown that adaptation of certain strains of 
Arabidopsis to high salt conditions is associ- 
ated with sequence variation in a gene that 
encodes a sodium-transporter protein”. 
GWAS in general are also showing exceptional 
promise for identifying causal sequence varia- 
tion in complex emergent traits in plants, such 
as crop yield and quality”. 

The application of high-throughput DNA 
sequencing and genome-capture technol- 
ogy will inevitably lead to the large-scale re- 
sequencing of the genomes of crop species, in 
much the same way that the tiny, relatively sim- 
ple Arabidopsis genome has been re-sequenced 
in the three new studies” *. These technologies 
will revolutionize plant breeding by enabling 
a wide variety of phenotypic variations to be 
mined for their associated sequence variations, 
which can then be used to select breeding 
lines’. This will substantially reduce the time 
taken to create varieties of crop plants that are 
adapted to cope with changes in growing con- 
ditions or new pathogens, and/or to improve 
crop yield. 
Precisely tuned 
antibodies nab HIV 


Newly discovered neutralizing antibodies that target sites on the envelope 
proteins of HIV-1 provide a window on how some of the most powerful of these 
antibodies acquire their potency and breadth of activity. SEE LETTER P.466 


PAUL R. CLAPHAM & SHAN LU 


The huge variation among HIV-1 viruses 
presents a formidable challenge for the 
development of an effective vaccine: a 
successful vaccine would probably need to elicit 
the production of broadly acting neutralizing 
antibodies. The challenge is even more daunt- 
ing when one considers that HIV-1 has evolved 
many ways to evade the neutralizing antibod- 
ies of infected subjects’. Nevertheless, a few 
HIV-infected individuals (dubbed ‘elite 
neutralizers’) do develop antibodies that potently 
nullify diverse HIV-1 strains. The discovery 
and characterization of these powerful antibod- 
ies is gaining momentum — as evidenced by 
three exciting new studies” *, including one by 
Walker et al.” on page 466 of this issue. 

Until 2009, it was difficult to isolate broadly 
neutralizing antibodies against HIV-1, and 
only a few had been described (Table 1). Such 
antibodies are generally derived as monoclonal 
antibodies, obtained from individual B cells — 
the immune cells that function as factories for 
antibody production. Two such antibodies 
(2F5 and 4E10) target the transmembrane 
glycoprotein gp41 of HIV-1, which anchors 
the viral envelope to the underlying virus 
particle*® (Fig. 1). Two other antibodies (b12 
and 2G12) target the outer envelope protein 
gp120. Of these, b12 blocks the binding site to 
CD4 (ref. 7), the main receptor for HIV on the 
surface of immune cells called T cells, whereas 
2G12 binds to glycans®, sugar components on 
the HIV-1 envelope. Over the past two years, 
a succession of studies has identified further 
monoclonal antibodies that target gp120. 
Antibodies that bind to newly discovered sites 
overlapping the CD4 binding site (HJ16 and 
particularly VRC01) show greatly improved 
breadth of action and potency over those 
already known®”, and the antibodies PG9, 
PG16 and CH01 to CH04 target a previously 
unknown quaternary structure on gp120 that 


TABLE 1 | A HISTORY OF HIV-1 NEUTRALIZING ANTIBODIES 

                                      Specificity 
Monoclonal        Year of     Envelope                           Glycan 
antibody          discovery   glycoprotein   Target site         involvement   Ref. 
2F5               1993        gp41           MPER                No            5 
4E10              1994        gp41           MPER                No            6 
b12               1994        gp120          CD4bs               No            7 
2G12              1994        gp120          Glycan structure    Yes           6 
PG9, PG16         2009        gp120          V2/V3 loops         Yes           10 
CH01-04           2010                                                         11 
PGT141            2011                                                         2 
VRC01             2010        gp120          CD4bs               No            9 
VRC-PG04          2011                                           No            4 
3BNC60            2011                                           No            3 
HJ16              2010        gp120          CD4bs               No            8 
PGT121, PGT125,   2011        gp120          V3 loop             Yes           2 
PGT135 

Each of the listed antibodies originated from a single HIV-1-positive elite neutralizer. MPER, membrane-proximal region of gp41; CD4bs, 
CD4-binding site. 
includes binding sites in its V2 and V3 loops. 

Walker et al.” describe a diverse set of mono- 
clonal antibodies that bind to a newly identi- 
fied site at the base of the V3 variable loop. For 
antibodies from at least one of the HIV-infected 
subjects, this site includes glycans, which, 
ironically, usually assemble on the envelope 
to prevent antibody binding. These new anti- 
bodies are tenfold more potent than previously 
described broadly acting monoclonal antibod- 
ies, but maintain activity against diverse HIV-1 
viruses even at relatively low doses. 

In the two other reports, Scheid et al.’ and Wu 
et al. investigate large numbers of newly identi- 
fied monoclonal antibodies that they amplified 
from samples obtained from several elite neu- 
tralizers; many of these antibodies have similar 
specificity and breadth of activity to VRC01 
— until now the most potent and broadly act- 
ing antibody. Scheid and colleagues use a new 
approach that increases the recovery of anti- 
body genes from B cells and, notably, they con- 
firm that such antibodies were circulating in the 
serum of an elite neutralizer. Both reports are 
a reassuring indication that VRC01-like antibodies 
are not extremely rare but are repeatedly 
generated, at least by elite neutralizers. 

The antibodies present in adult human 
blood recognize hundreds of millions of 
different foreign proteins. Yet the human 
genome has only a limited number of genes 
that encode antibodies. Antibody diversity 
is partly achieved by a process called affinity 
maturation, which generates mutations in 
the antibody genes in B cells. If single B cells 
carrying mutated antibodies on their surface 
bind efficiently to a target protein antigen, this 
stimulates the cells to multiply and to produce 
larger amounts of the antibodies. To date, all 
of the broad and potent neutralizing antibod- 
ies that target the HIV envelope (particularly 
VRC01-like antibodies) have been shown to 
contain many mutations and seem to have 
gone through repeated rounds of mutagenesis 
and selection of antibody variants. 

Wu et al.* combine in-depth sequencing 
and crystal-structure analyses to examine the 
diversity of VRC01-like antibody sequences 
in HIV-infected patients. Their data provide a 
fascinating insight into the families of VRC01-like 
antibody variants in individual subjects: 
there is an increasing divergence from the orig- 
inal antibody gene, the breadth and potency 
increasing with the number of mutations or 
amino-acid substitutions in the antibody 
sequence. Remarkably, the most broadly acting 
and potent antibodies from the infected indi- 
viduals had converged on structures similar to 
VRC01 that optimized interactions with the 
site on the envelope that makes the first contact 
with CD4. The related experiments of Scheid 
et al. support these observations. 

Studies over the past few years, culminating 
in the reports summarized here” *, have shown 
beyond doubt that HIV-1-positive subjects can 
elicit potent and broadly active neutralizing 
Figure 1 | Spike-studded HIV-1. The envelope of HIV-1 carries spikes. a, Each spike is made of three 
molecules of the surface glycoprotein gp120 and three molecules of the transmembrane glycoprotein 
gp41. Glycoprotein gp120 contains variable V1/V2 and V3 loops, as well as the binding site for CD4. 
b, The binding sites of broadly acting and potent HIV-1-specific neutralizing antibodies are shown as 
coloured circles. The target sites investigated by the new studies (a site at the base of the V3 loop’ and 
the CD4-binding site**) are marked by green circles. 


antibodies and that there are several exposed 
and relatively invariant sites on the viral 
envelope that are vulnerable to such antibodies. 

These observations have obvious implica- 
tions for the development of HIV vaccines. 
However, the optimism they bring needs to be 
tempered with caution, at least for now. This is 
because it is unclear how immunogenic mol- 
ecules that will elicit such antibodies could be 
designed and administered. The current studies 
investigate the neutralizing antibodies them- 
selves and the structure of envelope target mol- 
ecules to which they bind, but do not address 
the design of immunogens or immunization 
strategies that could induce similar antibodies. 

Nevertheless, the studies highlight two 
major problems. First, several target molecules 
of the broad, potent antibodies — including 
the glycans identified by Walker et al.” — are 
notoriously inefficient as immunogens, and 
it is not clear how the dramatic improve- 
ments required to make them effective could 
be achieved. Second, as discussed, the most 
potent and broadly acting antibodies have 
undergone substantial affinity maturation 
over a long period. It is not known whether 
the same features can be achieved with 
limited vaccinations in a much shorter pro- 
cess. Wu and colleagues* argue that knowing 
the structures and sequences of antibodies that 
are intermediates or precursors to the most 
effective versions will help in designing and 
identifying envelope immunogens that can 
steer developing antibodies towards breadth 
and potency. 

Key questions remain. Why do so few HIV- 
infected individuals develop broad and potent 
neutralizing antibodies? And are there further, as 
yet unknown, envelope targets for neutralizing 
antibodies? One final piece of information in 
Walker and co-workers’ report is derived from 
a concept in traditional vaccinology. Using 
theoretical modelling, they argue that a com- 
bination of antibodies possessing different 
target specificities will increase the breadth and 
potency of neutralization, as described in a previous 
study”. Taking into account the increasing 
number of potential targets for broadly neutral- 
izing antibodies on the HIV-1 envelope, it is 
possible that a vaccine able to elicit a combina- 
tion of antibodies targeting different sites will 
confer breadth, potency and protection, even 
if individual antibody components don't reach 
maximal neutralizing concentrations. 
Multiple reference genomes and 
transcriptomes for Arabidopsis thaliana 


Xiangchao Gan'*, Oliver Stegle”*, Jonas Behr**, Joshua G. Steffen**, Philipp Drewe**, Katie L. Hildebrand, Rune Lyngsoe®, 
Sebastian J. Schultheiss*, Edward J. Osborne’, Vipin T. Sreedharan*, André Kahles?, Regina Bohnert?, Géraldine Jean’, 
Paul Derwent’, Paul Kersey’, Eric J. Belfield®, Nicholas P. Harberd®, Eric Kemen’, Christopher Toomajian®, Paula X. Kover!°, 
Richard M. Clark*, Gunnar Rätsch*® & Richard Mott! 


Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and 
until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report 
the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. 
When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in 
at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding 
potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis 
variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most 
pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional 
studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions. 


Interpreting the consequences of genetic variation has typically relied on 
a reference sequence, relative to which genes and variants are annotated. 
However, this may cause bias, because genes may be inactive in the 
reference but expressed in the population’, suggesting that sequencing 
and re-annotating individual genomes is necessary. Advances in 
sequencing’ make this tractable for Arabidopsis thaliana’, whose 
natural accessions (strains) are typically homozygous. Relative to the 
119-megabase (Mb) high-quality reference sequence from Col-0 
(ref. 6), diverse accessions harbour a single nucleotide polymorphism 
(SNP) about every 200 base pairs (bp) (ref. 3), and indel variation is 
pervasive*”®. Characterizing this variation is crucial for dissecting the 
genetic architecture of traits by quantitative trait locus mapping in 
recombinant inbred lines (see, for example, ref. 9) or genome-wide 
association in natural accessions’. 

Here we have sequenced and accurately assembled the single-copy 
genomes of 18 accessions that, with Col-0, are the parents of more 
than 700 Multiparent Advanced Generation Inter-Cross (MAGIC) 
lines’, similar to the maize Nested Association Mapping (NAM)” 
population and the murine Collaborative Cross'”. These accessions 
comprise a geographically and phenotypically diverse sample across 
the species’. Using the genomes, seedling transcriptomes and com- 
putational gene predictions we have characterized the ancestry, 
polymorphism, gene content and expression profile of the accessions. 
We show that the functional consequences of polymorphisms are 
often difficult to interpret in the absence of gene re-annotation and 
full sequence data. The assembled genomes also contribute to the 
A. thaliana 1001 Genomes Project?>”’. 


Genome sequencing, assembly and variants 


We assembled the 18 genomes so that single-copy loci would be 
contiguous, with less than one assembly error per gene, and therefore 
suitable for annotation. Accessions were sequenced with Illumina 
paired-end reads* (Supplementary Table 1), generally with two 
libraries with 200-bp and 400-bp inserts and reads of 36 and 51 bp, 
respectively, to between 27-fold and 60-fold coverage. Each genome 
was assembled by using five cycles of iterative read mapping’* com- 
bined with de novo assembly’ (Supplementary Information sections 2 
and 3, and Supplementary Tables 1 and 2). We aligned reads to the 
final assemblies to detect polymorphic regions’ lacking read coverage 
(2.1-3.7 Mb per accession; Supplementary Table 3 and Supplemen- 
tary Fig. 2). At unique loci, polymorphic regions probably reflect 
complex polymorphisms**. The average N50 length (the contig size 
such that 50% of the entire assembly is contained in contigs equal to 
or longer than this value) of contiguous read coverage between poly- 
morphic regions was 80.8 kb (Supplementary Table 4). 
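
The N50 definition given above can be made concrete with a small Python sketch. The function name and the toy lengths are illustrative assumptions and are not part of the assembly pipeline described here.

```python
# Minimal sketch (hypothetical helper, toy data): N50 of a set of contig lengths.
def n50(lengths):
    """Return the length L such that contigs of length >= L together
    contain at least 50% of the total assembled sequence."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating sequence.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("N50 is undefined for an empty list of lengths")


# Toy lengths in kb: total 290, so half is 145; 80 + 70 = 150 crosses it.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70
```

In the context of this section the same calculation would be applied to stretches of contiguous read coverage between polymorphic regions rather than to classical contigs.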

To report complex alleles consistently, we defined all variants 
against the multiple alignment consensus of Col-0 and the assembled 
genomes. For each accession there were 497,668-789,187 single-base 
differences from Col-0, and about 45,000 ambiguous nucleotides 
(Supplementary Table 5). The latter may reflect heterozygosity 
(particularly in Po-0; Supplementary Figs 5-7) or copy-number 
variants, and they were largely in transposable elements and repeats 
covering 21.9% of the genome (Supplementary Information section 
5.1, and Supplementary Figs 8 and 9). Of 3.07 million SNPs, 45.2% 
were private to single accessions. 
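
To illustrate what 'private to single accessions' means, the Python sketch below counts the fraction of variable sites at which exactly one accession carries the non-reference allele. The 0/1 encoding, the function name and the toy matrix are assumptions made for illustration; the calls in this study were made against a multiple-alignment consensus, which this sketch does not reproduce.

```python
# Minimal sketch (hypothetical helper, toy data): fraction of private SNPs.
def private_fraction(genotypes):
    """genotypes: list of sites, each a list with one entry per accession,
    0 for the reference allele and 1 for the alternative allele.
    Returns the fraction of variable sites whose alternative allele
    is carried by exactly one accession."""
    variable = [site for site in genotypes if sum(site) > 0]
    if not variable:
        return 0.0
    private = sum(1 for site in variable if sum(site) == 1)
    return private / len(variable)


# Toy matrix: four sites scored in three accessions.
toy_sites = [
    [1, 0, 0],  # private to accession 1
    [1, 1, 0],  # shared by two accessions
    [0, 0, 1],  # private to accession 3
    [0, 0, 0],  # invariant, ignored
]
print(private_fraction(toy_sites))
```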

We identified 1.20 million indels, and 104,090 imbalanced sub- 
stitutions, in which a sequence in Col-0 was replaced by a different 
sequence (Supplementary Tables 3 and 7). Although 57.5% of indels 
or imbalanced substitutions were shorter than 6 bp, 1.9% were longer 
than 100 bp, and overall 14.9 Mb of Col-0 sequence was absent in one 
or more accessions (Fig. 1a and Supplementary Fig. 8). The assemblies 
were about 1.6% and about 4.3% shorter than the reference (including 
Figure 1 | Assembly and variation of 18 genomes of A. thaliana. 
a, Classification of sequence, SNPs and indels based on the Col-0 genome. 

b, Assembly accuracy (y axis; base substitution errors per 10 kb) measured 
relative to four validation data sets at each of eight stages in the IMR/DENOM 
assembly pipeline (x axis). Bur-0 survey (blue line): 1,442 survey sequences 
(about 417 bp each) in predominantly genic regions’; Bur-0 divergent (red 
line): 188 sequences (each about 254 bp) highly divergent from Col-0 (ref. 3); 
Ler-0 nonrepetitive (orange line): a predominantly single-copy 175-kb Ler-0 
sequence on chromosome 5; Ler-0 repetitive (purple line): a highly repetitive 
339-kb Ler-0 locus on chromosome 3 (ref. 18; Supplementary Information 
section 4). Iter, iteration. c, Genome-wide distribution of the minimum clade 
size for all pairs of accessions (excluding Po-0). Each pair is represented by a 
grey line, the mean over all pairs by the black line and the random distribution 
by the green line. d, Decay in linkage disequilibrium with distance (Po-0 
excluded). The black line shows r² between SNPs; the red line shows 
phylogenetic r² (Supplementary Information section 6). 


and excluding polymorphic regions, respectively), probably reflecting 
limitations in detecting long insertions. Although sequence differ- 
ences were enriched in transposable-element and intergenic regions, 
about 17% of bases deleted in one or more accessions were annotated 
as genic in Col-0 (Fig. 1a and Supplementary Fig. 8). The density of 
sequence differences is greater than between inbred strains of 
mice’, but less than between lines of maize’”. 

Both iterative and de novo assembly improved accuracy, with the 
latter being most effective at divergent loci (Fig. 1b, Supplementary 
Table 2 and Supplementary Fig. 10). As assessed with about 1.2 Mb of 
genomic dideoxy data*'*"” (Supplementary Information section 4), the 
substitution error rate was about 1 per 10 kb in single-copy regions, 
and about tenfold higher in transposable-element-rich regions. 
Further, RNA-seq reads covered about 100,000 SNPs per accession 
with 99.72% concordance (Supplementary Table 5), and junction 
sequences for 66 of 68 (97%) long indels and imbalanced substitutions 
were confirmed by PCR and dideoxy sequencing (Supplementary 
Table 8). The substitution error rate for our assemblies was comparable 
to that reported for four other A. thaliana genome assemblies’. 


Genome-wide patterns of ancestry 


The ancestral relationships of the accessions vary genome-wide. We 
computed phylogenies” across 1.25 million biallelic, non-private SNPs 
(Supplementary Information section 6). The ancestry of each pair of 
accessions within a phylogeny was quantified by using the genome- 
wide distribution of the minimum clade size of the subphylogeny 
containing the pair (Fig. 1c). Despite their wide geographical origins, 
with the exception of Po-0 and Oy-0, all pairs have distributions 
similar to that of an unstructured sample. The probability of recent 
co-ancestry is slightly higher than expected for a few pairs of accessions, 
with extended haplotype sharing at a minority of loci (Supplemen- 
tary Figs 11-15), perhaps reflecting selective sweeps’. Both linkage 
disequilibrium and correlation between neighbouring phylogenies 
decrease by 50% within 5 kb (Fig. 1d and Supplementary Fig. 16). 
Variation among the 18 accessions is similar to a diverse global 
A. thaliana sample’* in nucleotide diversity (Supplementary Figs 11- 
15), correlation with genomic features (Supplementary Tables 9-12) 
and structural variants (Supplementary Fig. 17). 
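
For readers unfamiliar with the linkage disequilibrium measure referred to above, the following Python sketch computes the standard r² statistic between two biallelic SNPs. It assumes haploid-style 0/1 genotypes, which is reasonable for largely homozygous accessions; the function name and toy vectors are hypothetical, and the phylogenetic r² used in the Supplementary Information is not reproduced here.

```python
# Minimal sketch (hypothetical helper, toy data): r^2 between two biallelic SNPs.
def ld_r2(a, b):
    """a, b: equal-length lists of 0/1 alleles, one entry per accession.
    Returns r^2 = D^2 / (pA(1-pA) pB(1-pB)), where D = pAB - pA*pB."""
    if len(a) != len(b) or not a:
        raise ValueError("allele vectors must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x * y for x, y in zip(a, b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy alleles for six accessions at two nearby SNPs.
snp1 = [0, 0, 1, 1, 0, 1]
snp2 = [0, 0, 1, 1, 1, 1]
print(ld_r2(snp1, snp2))
```

Averaging such pairwise values in bins of physical distance gives a decay curve of the kind summarized in Fig. 1d.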


Gene annotation and transcript diversity 


A naive projection of the coordinates of the 27,206 nuclear protein- 
coding genes from Col-0 (TAIR10 annotation) onto the 18 genomes 
predicted that 93.4% of proteins were changed in at least one acces- 
sion, with 32% of the total being affected by genic deletions, pre- 
mature termination codons, or other disruptions (Supplementary 
Table 13). This large tally of disrupted genes implies that reference 
annotations cannot be transferred reliably; in fact, re-annotation 
reveals compensating changes, ensuring that many genes encode 
apparently functional proteins (Fig. 2a). Thus, in 96.2% of the 8,757 
genes affected, the naive annotations were replaced by an alternative 
gene model in at least one accession (Fig. 2b and Supplementary 
Fig. 18). We predicted new splice sites in 64% of the 2,572 genes with 
splice site disruptions (in 696 cases the new sites were within 30 bp of 
the original ones; see, for example, Fig. 2a). Finally, there was evidence 
of alternative splicing in 2,106 genes (Supplementary Information 
sections 10.10-10.13). 

For genome annotation and expression analyses (for example 
Figs 2-4), we generated 78-bp RNA-seq reads from two biological 
replicates of seedling mRNA (about 9.5 million mapped reads per 
accession, including Col-0; Supplementary Information section 9, 
and Supplementary Table 14). We integrated read alignments” with 
sequence-based gene predictions” by using mGene.ngs (Supplemen- 
tary Information sections 9-10.3, and Supplementary Fig. 19). On 
average, 24,681 coding genes were predicted for each accession. 
Comparison of Col-0 de novo predictions with TAIR10 annotations 
(Supplementary Table 16) showed that these predictions are more 
accurate (transcript F-score 65.2%) than using the genome sequence 
(mGene”, 59.6%) or RNA-seq alignments alone (Cufflinks”’, 37.5%; 
Supplementary Table 17). Finally, we consolidated the de novo anno- 
tations by incorporating TAIR10 annotations where applicable 
(Supplementary Information section 10.4, and Supplementary Fig. 
20); novel transcript structures for a known TAIR10 gene were only 
accepted if each newly predicted intron was confirmed by RNA-seq 
alignments, or if the reference gene model was severely disrupted. 

We found, on average, 42,338 transcripts per accession (excluding 
Col-0), of which 5.5% (2,316) were novel (Table 1 and Supplementary 
Table 18). In each accession there were, on average, 319 novel genes 
(or gene fragments) supported by RNA-seq (Table 1); 717 novel genes 
were found in total, 496 whose sequence was present in Col-0 but not 
annotated, and 221 absent from the Col-0 genome but present in the 
de novo assemblies of the accessions. We found protein or expressed 
sequence tag matches for 74.9% of the new genes, primarily from 
A. thaliana, A. lyrata or other Brassicaceae species (Supplementary 
Information sections 10.8 and 10.9). 

For accession Can-0, we generated additional independent higher 
coverage RNA-seq data from seedling, root and floral bud, which we 
used to confirm 83.3% of re-annotated introns (read alignment over 
splice junction) and 59.9% of transcripts (confirmation of every 
intron, or read coverage of 50% of the transcript for single exon 
Figure 2 | Transcript and protein variation. a, Example of a splice site change 
between two haplotypes for the gene AT1G64970. Haplotype I (Col-0) is spliced 
with an intron 6 bp (two amino acids) shorter than haplotype II (Ler-0); Po-0 
(heterozygous) shows allele-specific expression of both. b, Re-annotation of the 
FRIGIDA locus showing annotations for accessions Sf-2 (functional), and Col-0 
(truncated by a premature stop) and Ler-0 (non-functional) (Supplementary 
Figs 18 and 42). Right: the 19 accessions are shown clustered on the basis of the 
AA distance between their FRIGIDA amino-acid sequences. Common isoform 
clusters (at distance 2% or less; red line) are shown, leading to three clusters 
with three, seven and nine accessions. c, Proteome diversity for coding genes, 
pseudogenes and A. lyrata genes (top) and for genes with disruptions (bottom). 
Reported is the fraction of genes with relative AA distance to other accessions 
(average over pairs) in the given colour-coded interval (Supplementary 
Information section 10.7). d, Frequency of isoforms of coding genes and 
pseudogenes (top), and those associated with different disruptions (bottom). 


transcripts; Table 1). We also obtained additional RNA-seq data for 
Col-0 and found similar confirmation rates for the reference annota- 
tion (Supplementary Table 19). Moreover, for Can-0 we confirmed 
72.1% and 84.2% of novel introns and transcripts. Many novel introns 
stemmed from splice disruptions that tended to be weakly expressed 
so RNA-seq evidence was scarcer (Supplementary Fig. 22). Finally, 
more than 75% of novel alternative splicing events were supported by 
RNA-seq (Supplementary Information section 10.5). 


Proteome diversity 


To understand the effect of genetic diversity on proteins, it is insuf- 
ficient to study isolated DNA polymorphisms in the context of the 
reference annotation. We therefore defined the distance between two 
amino-acid (AA) sequences by the fraction of amino-acid residues that 
did not align identically in their global alignment. For example, for 
FRIGIDA, between Col-0 and Sf-2, a premature stop codon leads to 
an AA distance of 49% (Fig. 2b). In 77% of proteins, the mean AA 
distance between all accessions was less than 3% (Fig. 2c). However, on 
average, 747 proteins per accession had a distance larger than 50% to 
any TAIR10 protein, with markedly greater variation for pseudogenes. 
As expected, variation between A. thaliana and its congener A. lyrata™* 
exceeds that observed among A. thaliana accessions (Fig. 2c and 
Supplementary Fig. 23). Disruptions to splice sites and translation start 
and stop codons typically caused less severe effects than premature stop 
codons or frame shifts (Fig. 2c) when compensating splice sites created 
alternative in-frame splicing (for example Fig. 2a and Supplementary 
Fig. 24). 
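
The AA distance defined above can be sketched in Python as follows. This is an illustration under stated assumptions and not the implementation used for this study: the global alignment is a plain Needleman-Wunsch with match +1, mismatch -1 and gap -1 (scores chosen here for simplicity), and the fraction is taken over alignment columns, which is one possible reading of the definition. Function names and sequences are hypothetical.

```python
# Minimal sketch (assumed scoring scheme, toy data): AA distance from a global alignment.
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def aa_distance(a, b):
    """Fraction of alignment columns that are not identical residues."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    identical = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 1 - identical / len(x)


# Toy example: a protein truncated to half its length by a premature stop.
full_length = "MKTAYIAKQR"
truncated = "MKTAY"
print(aa_distance(full_length, truncated))
```

With this reading, a truncation that removes half of a protein gives a distance of about 50%, in line with the FRIGIDA example quoted above.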

Next, we identified protein isoforms across accessions (Fig. 2b, 
right; distinct isoforms differ by at least roughly 2% AA distance; 
Supplementary Information section 10.7). For 80% of protein coding 
genes the most frequent isoform was very common (frequency at least 
15 out of 19), whereas isoforms for pseudogenes usually occurred at 
lower frequency. Moreover, isoforms for large disruptions were rare 
(frequency 3 or less) for 37% of affected genes (Fig. 2d). This was most 
pronounced for premature stops and frameshifts, where purifying 
selection is expected to be strongest. 

As expected*”, disease resistance genes of the coiled-coil and Toll 
interleukin 1 receptor subfamilies of the Nucleotide-Binding Leucine 
Rich Repeat (NB-LRR) gene family were predicted to encode the most 
variable proteins (Fig. 4a and Supplementary Fig. 26). F-box and 
defensin-like genes implicated in diverse processes including 
defence” were also highly variable. In contrast, housekeeping genes 
showed little variation. 


Variation in seedling gene expression 


Median expression heritability of protein-coding genes was 39%, sim- 
ilar to that of novel genes (36%) and pseudogenes (38%), and more 
than for non-coding RNAs (30%) (Supplementary Fig. 27). In total, 
75% (20,550) of protein-coding genes (and 21% of non-coding RNAs 
and 21% of pseudogenes) were expressed in at least one accession 
(false discovery rate (FDR) 5%), and 46% (9,360) of expressed 
protein-coding genes were differentially expressed between at least one 
pair of accessions (Fig. 3a; FDR 5%, Supplementary Information 
section 11). Of these, 19% (1,750) had more than tenfold expression 
changes, and 1.5% (142) more than 100-fold (Fig. 3b). For about 60% 
of genes, at least five accessions contributed to expression variation 
(Fig. 4d; Supplementary Information section 11.8). 
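
As a rough illustration of how a gene can be placed into the fold-change categories mentioned above (more than tenfold, more than 100-fold), the sketch below computes the ratio between the highest and lowest expression value across accessions. The pseudocount, the function names and the bin labels are assumptions for illustration only; the actual procedure is described in Supplementary Information section 11.

```python
def fold_change(expression, pseudocount=1.0):
    """Ratio of highest to lowest expression across accessions.

    A pseudocount (assumed here) avoids division by zero for unexpressed genes.
    """
    lowest = min(expression) + pseudocount
    highest = max(expression) + pseudocount
    return highest / lowest


def fold_bin(fc):
    """Assign a fold change to one of three illustrative categories."""
    if fc > 100:
        return ">100-fold"
    if fc > 10:
        return ">10-fold"
    return "<=10-fold"
```

With one value per accession (19 values in this study), `fold_bin(fold_change(values))` gives the category for a single gene.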

Although the small sample size (19) precludes genome-wide asso- 
ciation scans to identify trans expression quantitative trait loci (eQTLs), 
we identified potential cis-acting nucleotide variants, copy-number 
variants and gene structural variants (for example large indels and gene 
structure changes) associated with expression for 9% (836) of differ- 
entially expressed genes (FDR 5%; Supplementary Information section 
12.2; we assessed gene-copy-number variation as in Supplementary 
Information section 12.4). Much of this variation was highly heritable 
(Fig. 3a). Consistent with identifying likely causal variants, 85% and 
93% of associated SNPs and single-nucleotide indels for cis-eQTLs were 
within 5 and 10 kb of the gene, respectively, and were strikingly con- 
centrated in the 100-bp promoter region and 5’ genic sequences 
(Fig. 3c, d). This was also true for heritable intron retention events, in 
which most cis associations were within the intron or less than 1 kb 
distant (Supplementary Fig. 32). Our results corroborate the general 
findings of extensive cis regulation of gene expression in A. thaliana. 
Neither environmental variation nor population structure markedly 
affected expression variation (Supplementary Information section 13). 
Copy-number and structural variants were associated with expression in 
3% (240) of differentially expressed genes, including 45% (64 out of 142) 
of genes with more than 100-fold differences (Fig. 3b), consistent with 
array studies. 
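
The counts and percentages quoted in this section can be cross-checked with simple arithmetic. The sketch below assumes that each percentage is taken relative to the denominator shown (20,550 expressed protein-coding genes, 9,360 differentially expressed genes, or 142 genes with more than 100-fold differences); those pairings are inferred from the wording of the text.

```python
# (label, part, whole, percentage reported in the text, decimals reported)
CHECKS = [
    ("differentially expressed among expressed genes", 9360, 20550, 46, 0),
    ("more than tenfold change", 1750, 9360, 19, 0),
    ("more than 100-fold change", 142, 9360, 1.5, 1),
    ("genes with an associated cis variant", 836, 9360, 9, 0),
    ("copy-number or structural variant association", 240, 9360, 3, 0),
    ("more than 100-fold genes with such variants", 64, 142, 45, 0),
]


def percent(part, whole, decimals=0):
    """Percentage of `part` in `whole`, rounded as in the text."""
    return round(100 * part / whole, decimals)


def consistent(checks=CHECKS):
    """True if every reported percentage matches its count and denominator."""
    return all(percent(p, w, d) == r for _, p, w, r, d in checks)
```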

Differential gene expression varied by gene ontology (GO) and 
gene family (Fig. 4b-d, Supplementary Table 24 and Supplementary 
Figs 39-41). Seventeen of the 18 GO classifications that were enriched 
for differential expression (P<10 *) concerned response to the 
biotic environment, including pathogen defence and the production 
of glucosinolates to deter herbivores (Supplementary Table 24). 
These include NB-LRR genes (echoing protein variation), of which 
74% were differentially expressed at up to 400-fold change, and for 
which many accessions typically contributed to differential expression 
(Fig. 4b-d). Patterns for housekeeping genes (such as ribosomal 
proteins, eukaryotic initiation factors or kinesins) were markedly 
different: although many were differentially expressed, fold changes were 
generally small, with variation more often being limited to a few 
accessions (Fig. 4b-d). Differentially expressed genes generally had much 
higher nucleotide diversity at synonymous sites relative to other 
expressed genes, a pattern also observed but less extreme at 
non-synonymous sites (Supplementary Table 25). This suggests that 
differences in expression level were not due solely to reduced selective 
constraint. 

The type II MADS box transcription factor family showed striking 
expression polymorphisms (Fig. 4b-d), including for the 
FLOWERING LOCUS C (FLC) and MADS AFFECTING 
FLOWERING (MAF) genes. FLC, a floral inhibitor expressed highly 
in accessions that require prolonged cold (vernalization) to flower, 
varied more than 400-fold (Supplementary Fig. 42). F-box and 
defensin-like genes were exceptional in that expression was restricted to 
a minority of genes (41% and 12%, respectively; Fig. 4b), perhaps 
reflecting tissue-specific or environment-specific expression. 
Our data suggest that high turnover for some F-box families in the 
A. thaliana lineage extends to gene expression as well. 

Figure 3 | Quantitative variation of coding gene expression. 
a, The overlap between heritable (more than 30%) and differentially 
expressed (FDR 5%) genes, and genes with a cis-eQTL (FDR 5%). 
b, Differentially expressed genes and genes with cis-eQTLs (FDR 5%) 
categorized by fold change. Nucleotide variants (orange bars; 647 
cis-eQTLs) are SNPs and single-base indels; copy-number variants (green 
bars; 42 cis-eQTLs) are regions with elevated coverage in aligned genomic 
reads in at least one accession; gene structural variants (black bars; 
227 cis-eQTLs) are accession-specific deletions, insertions or changes to 
the gene model. c, The spatial distribution of nucleotide-variant eQTLs 
relative to the start of protein-coding genes (FDR 5%, overlapping genes 
removed; n = 647). The line shows density of gene length. 
d, Frequencies of nucleotide-variant eQTLs in protein-coding genes, 
classified by component (bar widths are proportional to the components' 
average physical lengths): red bars, upstream; yellow bars, 5' 
untranslated region; green bars, coding sequence exons; blue bars, 
introns; cyan bars, 3' untranslated region; grey bars, downstream. 


Conclusion 


Our study goes beyond cataloguing polymorphisms to provide 
genome sequences for a moderately sized population sample (see also 
refs 4, 16). In doing so, we were able to annotate each genome largely 
independently of the Col-0 reference. We found that disruptive poly- 
morphisms were frequently compensated for, thereby conserving 
coding potential and highlighting the limitation of inferring conse- 
quences of polymorphisms in the absence of complete sequence data. 

Our assemblies are accurate and largely complete in single-copy 
regions, although additional work will be needed to assemble the 
roughly 20% of the genome comprising repeats and transposable 
elements. Disentangling copy variation, long insertions and other 
genomic rearrangements remains a challenge. The methods we 
developed are of immediate relevance to the broader A. thaliana 
1001 Genomes Project and to other organisms, and highlight the 
importance of RNA-seq data for annotation. 

Figure 4 | Protein diversity and gene expression vary by gene category or 
family. The numbers next to each row are gene counts. The gene families were 
selected from Supplementary Figs 26 and 39-41 to represent the breadth of 
observed variation. a, Distribution of average AA distances to other accessions 
(compare with Fig. 2c). b, Fraction of unexpressed, expressed and differentially 
expressed genes (expressed is a superset of differentially expressed). 
c, Distribution of genes categorized by fold change (between lowest and highest 
across 19 accessions). d, Distribution of the numbers of accessions contributing 
to differential expression. TF, transcription factor; CC, coiled-coil; TIR, Toll 
interleukin-1 receptor; NB-LRR, nucleotide-binding leucine-rich repeat. 

Table 1 | Summary of gene predictions 

                    Total                           Novel 
Type                Per accession   RNA-seq         Per accession   RNA-seq 
                                    confirmed (%)                   confirmed (%) 
Genes               33,197          62.7            319             88.4 
Transcripts         42,338          59.9            2,316           84.2 
Introns             127,640         83.3            1,345           72.1 
Start codons        33,264          n.a.            503             n.a. 
Stop codons         33,720          n.a.            528             n.a. 
Intron retentions   1,192           78.1            873             76.5 
Exon skips          80              80.5            38              76.7 

'Total' and 'novel' are average counts over all 19 accessions. 'RNA-seq confirmed' gives the percentage 
fully confirmed using independent RNA-seq data (three tissues) for Can-0, the most divergent accession. 


Finally, despite using only 19 accessions, we fine-mapped cis-eQTLs 
to small genomic regions (less than 10 kb), suggesting that 
analogous genome-wide scans in the more than 700 derived 
MAGIC lines could have single-gene mapping resolution for some 
loci. Our findings indicate that the MAGIC lines, for which population 
structure is largely mitigated, will be an important and complementary 
resource to genome-wide association studies in 
A. thaliana populations. 


METHODS SUMMARY 


We used the same seed stocks for Col-0 and the 18 accessions Bur-0, Can-0, Ct-1, 
Edi-0, Hi-0, Kn-0, Ler-0, Mt-0, No-0, Po-0, Oy-0, Rsch-4, Sf-2, Tsu-0, Wil-2, Ws- 
0, Wu-0 and Zu-0 that originated the MAGIC lines. DNA and RNA sequencing 
was performed with standard (DNA) or modified (RNA-seq) Illumina protocols. 
All methods are described fully in Supplementary Methods; software is available 
from the authors on request. 

CTCF-binding elements mediate control 
of V(D)J recombination 

Immunoglobulin heavy chain (IgH) variable region exons are assembled from VH, D and JH gene segments in developing 
B lymphocytes. Within the 2.7-megabase mouse Igh locus, V(D)J recombination is regulated to ensure specific and 
diverse antibody repertoires. Here we report in mice a key Igh V(D)J recombination regulatory region, termed 
intergenic control region 1 (IGCR1), which lies between the VH and D clusters. Functionally, IGCR1 uses CTCF 
looping/insulator factor-binding elements and, correspondingly, mediates Igh loops containing distant enhancers. 
IGCR1 promotes normal B-cell development and balances antibody repertoires by inhibiting transcription and 
rearrangement of DH-proximal VH gene segments and promoting rearrangement of distal VH segments. IGCR1 
maintains ordered and lineage-specific VH(D)JH recombination by suppressing VH joining to D segments not joined to 
JH segments, and VH to DJH joins in thymocytes, respectively. IGCR1 is also required for feedback regulation and allelic 
exclusion of proximal VH-to-DJH recombination. Our studies elucidate a long-sought Igh V(D)J recombination control 
region and indicate a new role for the generally expressed CTCF protein. 


The variable region exons of IgH, Ig light (IgL) and T-cell receptor 
genes are assembled during B- or T-cell development from variable 
(V), diversity (D) and joining (J) gene segments. The V(D)J 
recombination reaction is initiated by RAG endonuclease. RAG cleaves 
only paired gene segments flanked, respectively, by complementary 
recombination signals (RSs) referred to as 12RSs and 23RSs, a 
restriction referred to as the 12/23 rule. The cleaved segments are then 
fused via classical non-homologous end-joining (C-NHEJ). The mouse Igh 
locus contains hundreds of VH gene segments within a several-megabase 
(Mb) region, followed downstream by a 100 kilobase (kb) 
'intergenic' region separating the most downstream VH (generally 
referred to as VH81X, but formally denoted VH7183.a2.3; NCBI 
accession number AJ851868) from DFL16.1, the first of 13 clustered DH 
segments. The most downstream D (DQ52) lies upstream of 4 JH 
segments (JH1-JH4). VH and JH gene segments are flanked by 
23RSs and D segments are flanked on both sides by 12RSs, ensuring 
that VH(D)JH assembly involves joining VH and JH segments to the 
upstream and downstream sides of a DH segment, respectively. The 
Igh constant region (CH) exons lie in the 200-kb region downstream of 
the JH segments; RNA splicing fuses productively assembled VH(D)JH 
and CH exons during Igh messenger RNA formation. 
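
The 12/23 rule described above can be expressed as a small check on the recombination signals that face each other when two segments are joined. This is a minimal sketch: the dictionary layout and function name are hypothetical, and placing the VH 23RS on the downstream side and the JH 23RS on the upstream side is an assumption based on the standard locus layout rather than on a statement in this text.

```python
# Spacer length of the RS on each side of a segment type (None = no RS there).
# D segments carry 12RSs on both sides; VH and JH carry 23RSs (side assumed).
RS_FLANKS = {
    "VH": {"upstream": None, "downstream": 23},
    "D": {"upstream": 12, "downstream": 12},
    "JH": {"upstream": 23, "downstream": None},
}


def permitted_by_12_23(upstream_segment, downstream_segment):
    """True if the facing RSs of the two segments form a 12RS/23RS pair."""
    rs_a = RS_FLANKS[upstream_segment]["downstream"]
    rs_b = RS_FLANKS[downstream_segment]["upstream"]
    if rs_a is None or rs_b is None:
        return False
    return {rs_a, rs_b} == {12, 23}
```

Under this sketch, D-to-JH and VH-to-D joins satisfy the rule while a direct VH-to-JH join does not. The rule alone therefore does not forbid joining a VH to an un-rearranged D segment.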

Igh V(D)J recombination in developing B cells is regulated to be 
highly ordered and stage specific; thus, DH-to-JH joining 
developmentally occurs first on both alleles in pre-progenitor (pro)-B 
cells followed by appendage of a VH to a DJH complex in pro-B cells. 
Direct joining of a VH to an un-rearranged DH does not occur, even 
though theoretically permitted by the 12/23 rule. The VH-to-DJH 
joining step is also regulated to achieve lineage specificity; thus, 
although developing T cells generate DJH joins, they do not form 
complete VH(D)JH exons. At the pro-B stage, V(D)J recombination 
is regulated in the context of allelic exclusion, with a signal from a 
productive (that is, μ IgH protein-encoding) VH(D)JH rearrangement 
inhibiting VH-to-DJH joining on the other Igh allele, if it is in the DJH 
configuration. Expression of the μ chain also signals development to 
the precursor (pre)-B cell stage and IgL V(D)J recombination. To 
generate such signals in pro-B cells, μ IgH chains must pair with 
surrogate IgL chains. Subsequently, μ chains must pair with IgL 
chains in pre-B cells to mediate the pre-B-to-IgM+ B-cell transition. 
Lastly, Igh V(D)J recombination is regulated to ensure utilization of 
VH segments across the large VH locus. However, proximal VH 
segments, particularly VH81X, are rearranged more frequently than distal 
VH segments, leading to over-representation in primary VH(D)JH 
repertoires. Repertoire normalization for distal VH segments in 
mature B cells relies on cellular selection, promoted, in part, by 
the inability of certain proximal VH segments, including VH81X, to 
pair with surrogate IgL chains and IgL chains. 

V(D)J recombination at all antigen receptor loci is effected by the 
common V(D)J recombinase comprised of RAG and C-NHEJ 
components. Regulation of Igh V(D)J recombination in the context of 
order/stage, lineage and allelic exclusion is achieved via modulation 
of substrate V, D and J accessibility. Correlates of such accessibility 
include transcription of un-rearranged gene segments and certain 
DNA and histone modifications. Igh locus contraction and looping 
may also mediate higher-order regulation of V(D)J recombination, 
for example by bringing distant VH segments into proximity with the 
DJH. Until now, cis elements that control order, lineage-specificity, 
allelic exclusion and/or differential VH utilization have been elusive. 
The only known long-range Igh regulatory elements are a 
transcriptional enhancer (termed iEμ) in the intron between the JH and CH 
segments and a set of long-range enhancers (termed the 3' Igh 
regulatory region) downstream of the CH segments. The iEμ 
transcriptional enhancer is required for efficient Igh V(D)J recombination, 
particularly VH-to-DJH joining, although the mechanisms by 
which it influences this process are unknown. Thus far, the 3' Igh 
regulatory region has not been implicated in V(D)J recombination. As 
most critical aspects of Igh V(D)J recombination are regulated at the 
VH-to-DJH step, relevant regulatory elements may reside in the 
100 kb intergenic region separating the VH and DH segments (see 
Supplementary Discussion). 


Role in normal B-cell development 

The region several kilobases upstream of DFL16.1 harbours chromatin 
modifications and two CTCF-binding elements (CBEs) 
suggestive of a potential regulatory region (Supplementary Fig. 1). 
CTCF is an 11-zinc-finger nuclear protein implicated in 
transcriptional insulation, chromatin boundary formation, transcriptional 
activation/repression and chromosome looping. There are several 
other potential cis-elements closely linked to these CBEs including 
potential PU.1- and YY1-binding sites (using the JASPAR database). 
We refer to this cluster of factor-binding sites as IGCR1 (Fig. 1). To 
test for a role in Igh V(D)J recombination, we generated an 
IGCR1-deleted 129SV allele in which the 4.1-kb DNA fragment that contains 
both the CBEs and other binding sites was deleted in the mouse germ 
line (Fig. 1a and Supplementary Fig. 2). To test for specific roles of the 
CBEs, we generated mice in which both were replaced with scrambled 
sequences that do not bind CTCF (Supplementary Figs 1 and 3). Mice 
heterozygous or homozygous for the IGCR1 deletion are referred to, 
respectively, as IGCR1+/− and IGCR1−/−, and mice heterozygous or 
homozygous for the dual CBE mutation are referred to, respectively, 
as IGCR1/CBE+/− or IGCR1/CBE−/−. Because generation of mutant 
alleles involved loxP insertion, we generated control lines 
heterozygous or homozygous for the loxP insertion referred to, respectively, as 
loxP+/i or loxPi/i (Fig. 1a). As wild-type, loxP+/i and loxPi/i mice gave 
essentially identical results, we refer to them collectively as 'controls'. 
As a further control, we deleted an approximately 2-kb DNA 
fragment downstream of the DH-proximal end of IGCR1 and found no 
obvious phenotype (Supplementary Fig. 10). 

IGCR1/CBE+/− or IGCR1/CBE−/− mice had similar splenic IgM+ 
B-cell numbers as controls (Fig. 1b and Supplementary Fig. 5a). 
However, IGCR1/CBE+/− and, more so, IGCR1/CBE−/− mice had 
a substantial diminution in bone marrow pre-B cell numbers (Fig. 1b 
and Supplementary Fig. 5b). As the pro-B-to-pre-B transition is 
signalled by a productive VH(D)JH in pro-B cells, this developmental 
defect suggests an Igh V(D)J recombination defect. As a more 
sensitive test for the roles of IGCR1 in B-cell development, we bred 129SV 
IGCR1/CBE+/− mice with C57BL/6 wild-type mice to generate F1 
mice with a wild-type Igmb allele and a CBE-mutated Igma allele and 
assayed B cells for surface IgMa and IgMb expression. Remarkably, 
whereas normal F1 mice, as expected, have roughly equal numbers of 
IgMa- and IgMb-expressing B cells (but not both due to Igh allelic 
exclusion), most IgM+ bone marrow and splenic B cells in F1 mice 
carrying the IGCR1 CBE-mutant Igma allele express IgMb (Fig. 1c). 
Thus, mutation of the IGCR1 CBEs renders an Igh allele ineffective in 
supporting B-cell development when competing against a wild-type 
Igh allele. We found identical B-cell developmental defects in 
IGCR1+/− and IGCR1−/− mice (Supplementary Figs 4b, c and 5c, d). 

Figure 1 | Mutation of IGCR1 CBEs impairs B-cell development. a, Murine 
129SV Igh locus (NCBI accession number: AJ851868) schematic showing the 
4.1-kb IGCR1 region in wild type (WT) compared to IGCR1-deleted, 
loxP-inserted, or CBE-mutated configuration. b, Flow cytometry analysis of IgM− 
bone marrow (BM) and IgM+ splenic B-cell populations in wild-type, loxPi/i 
and IGCR1/CBE−/− mice. In bone marrow the B220+CD43+ pro-B and 
B220+CD43− pre-B cell populations are indicated. c, Expression of IgMa and 
IgMb allotypic markers in bone marrow and spleen from wild-type IgMa/IgMa 
(pure 129SV), wild-type IgMb/IgMb (pure C57BL/6), wild-type F1 (IgMa/IgMb) 
and heterozygous mutant IGCR1/CBE IgMa/wild-type IgMb mice. 


Mediation of diverse Igh repertoires 


We used a polymerase chain reaction (PCR) approach 
(Supplementary Fig. 6a) to assay for DJH and VH(D)JH rearrangements in purified 
control, IGCR1/CBE+/−, IGCR1/CBE−/−, IGCR1+/− and IGCR1−/− 
bone marrow pro-B and pre-B cells, and in splenic B cells. We assayed 
for rearrangements of the two most DH-proximal VH families (VH7183 
and VHQ52) and the most distal VH family (VHJ558). Igκ Vκ-to-Jκ joins 
were assayed as a stage-specific control and the mouse Dlg5 gene as a 
loading control. Levels of DJH and VκJκ rearrangements did not vary 
markedly among different populations or genotypes; thus, V(D)J 
recombination in general was not affected by the mutations (Fig. 2a 
and Supplementary Fig. 6). However, relative levels of proximal 
VH7183DJH rearrangements were markedly increased and those of distal 
VHJ558DJH rearrangements markedly reduced in IGCR1/CBE−/− and 
IGCR1−/− pro-B cells, with both being intermediate in 
IGCR1/CBE+/− and IGCR1+/− pro-B cells (Fig. 2a and Supplementary Fig. 6). 


Within the two proximal VH families, VH usage was even more skewed 
towards the most D-proximal members in IGCR1−/− pro-B cells 
(Supplementary Fig. 6c). Together, these findings are consistent with 
IGCR1 mutations resulting in cis-acting increases and cis-acting 
decreases, respectively, in proximal and distal VH rearrangement. 
Given that the proximal VH segments contribute to a substantial 
fraction of VH(D)JH rearrangements (about 40%) in normal pro-B cells, 
increased VH7183 joins in IGCR1/CBE+/− and IGCR1+/− pro-B cells 
indicate that the absolute level of VH-to-DJH rearrangements on 
mutant alleles, although even more biased towards proximal VH 
segments than normal, is not decreased. In the various IGCR1 mutant 
pre-B cells and splenic IgM+ B cells, repertoire bias remained, although the 
extent was progressively moderated (Supplementary Fig. 6), probably 
due to cellular selection for VH repertoire normalization. 

Figure 2 | IGCR1 mutations alter VH usage, germline transcription and 
rearrangement order. a, PCR analyses of indicated VH family rearrangements 
in pro-B cells from indicated mice compared to a Dlg5 loading control. Results 
are typical of four experiments. Bands corresponding to rearrangements to 
various JH segments are indicated on right. b, RT-PCR analysis of indicated 
germline VH transcripts in three independent wild-type and IGCR1−/− 
A-MuLV-virus-transformed Rag2−/− pro-B-cell lines. N, nonspliced sense/ 
antisense; S, spliced sense. c, ChIP-qPCR analyses of H3K4me2 and H3K9ac 
histone modifications at indicated VH segments in 129SV Rag2−/− (black) and 
Rag2−/−IGCR1−/− (red) A-MuLV-transformed pro-B lines. The 5' region (5'), 
gene body (G) and 3' region (3') of VH81X and gene body (G) of VHQ52.2.4 were 
analysed. Average values and standard deviations of three experiments with 
one line shown are representative of results from both. d, Semi-quantitative 
PCR analyses of direct VH-to-D rearrangements in sorted pro-B cells from 
indicated mice. The PCR assays used for panels a, b and d are diagrammed in 
Supplementary Figs 6a, 7a and 8a. 


Regulation of germline VH transcription 


To measure germline VH transcripts, we generated Rag2-deficient 
Abelson murine leukaemia virus (A-MuLV)-transformed wild-type, 
IGCR1+/− and IGCR1−/− pro-B lines. Rag2-deficient lines have 
unrearranged Igh alleles; thus, any detected VH transcripts are germline. 
RNA was assayed via reverse transcriptase PCR (RT-PCR) for VH 
expression, using one primer from the VH leader sequence and another 
from downstream of the RS (Supplementary Fig. 7a). On the basis of 
size, the PCR assay detects both unspliced germline VH transcripts 
(sense or antisense) and slightly smaller, spliced sense germline VH 
transcripts (Fig. 2b). Rag2−/− pro-B lines had robust DH 
transcripts and spliced and un-spliced VHJ558 transcripts, but lacked readily 
detectable VHQ52 or VH7183 transcripts (Fig. 2b). However, 
Rag2−/−IGCR1+/− and, more so, Rag2−/−IGCR1−/− pro-B lines 
showed marked upregulation of spliced and unspliced VHQ52 and 
VH7183 transcripts with normal levels of VHJ558 and DH transcripts 
(Fig. 2b and Supplementary Fig. 7d). We even detected by northern 
blotting a ~3.5-kb VH81X-hybridizing transcript in RNA from 
Rag2−/−IGCR1−/− lines, but not in wild-type Rag2−/− lines 
(Supplementary Fig. 7f). Primary Rag2−/−IGCR1−/− pro-B cells also 
strongly upregulated germline VH7183 transcripts (Supplementary 
Fig. 7e). Lastly, chromatin immunoprecipitation-sequencing 
(ChIP-seq) and chromatin immunoprecipitation-quantitative PCR 
(ChIP-qPCR) analyses revealed that deletion of IGCR1 led to a marked 
increase in active histone marks over VH81X (VH7183.a2.3) and the 
adjacent VHQ52.a2.4 germline gene segments (Fig. 2c and Supplementary 
Fig. 7b, c). Thus, IGCR1 suppresses activation of germline VH 
segments over distances of at least 100 kb. 


Role in order and lineage specificity 


We assayed for VH81X-to-germline-DQ52 joins via PCR with a forward 
VH81X-specific primer and a reverse primer from sequences between 
DQ52 and JH1 (Supplementary Fig. 8). Whereas we did not detect direct 
VH81X-to-DQ52 joins in control pro-B cells, we readily detected them in 
IGCR1/CBE+/−, IGCR1/CBE−/−, IGCR1+/− and IGCR1−/− pro-B 
cells (Fig. 2d and Supplementary Fig. 8). Sequences of 133 independent 
direct VH7183DQ52 joins revealed that 120 involved VH81X, 12 involved 
the downstream pseudo-VH7183, and one involved the next VH7183 
upstream of VH81X (Supplementary Table 2). Therefore, integrity of 
the IGCR1 CBEs is required for ordered Igh V(D)J recombination in 
pro-B cells, at least for proximal VH segments. 

To examine potential IGCR1 roles in lineage-specific Igh V(D)J 
recombination, we assayed for D-to-JH, VH-to-DJH and Vκ-to-Jκ 
rearrangements in DNA from CD4+CD8+ (double-positive) 
thymocytes from control and IGCR1/CBE+/−, IGCR1/CBE−/−, IGCR1+/− 
and IGCR1−/− mice (Fig. 3a and Supplementary Fig. 9). We detected 
DQ52JH rearrangements in all mice (Fig. 3a and Supplementary Fig. 9). 
However, whereas there were no VH(D)JH rearrangements in controls, 
we readily detected VH(D)JH rearrangements of proximal VH7183 and 
VHQ52 segments, but not distal VHJ558 segments, in mutant 
double-positive thymocytes (Fig. 3a and Supplementary Fig. 9). Lack of VκJκ 
rearrangements confirmed absence of B-cell contamination. Cloning 
and sequencing of VH7183- and VHQ52-to-DJH rearrangements from 
IGCR1−/− double-positive thymocytes revealed predominant 
utilization of the most proximal VH segments (VH81X and VHQ52.a2.4; 
Supplementary Tables 3 and 4). We also assayed for direct 
VH81X-to-germline DQ52 joins in double-positive thymocytes (Fig. 3b and 
Supplementary Fig. 8). As expected, controls lacked detectable direct 
VH-to-D joins; but such joins were readily apparent in mutant 
thymocytes (Fig. 3b and Supplementary Fig. 8). Nucleotide sequencing of 32 
VHD joins revealed 29 used VH81X and the rest used the downstream 
pseudo-VH7183 (Supplementary Table 2). Thus, IGCR1 CBEs are 
required for lineage-specific Igh VH-to-DJH recombination. 

Figure 3 | IGCR1/CBE mutations lead to VH(D)JH and VHD rearrangements 
in thymocytes. a, PCR analyses of VH family rearrangements in sorted 
double-positive thymocytes (DP-T) and total splenic B cells from indicated mice with 
Dlg5 as a loading control. Bands corresponding to rearrangements to various JH 
segments are indicated on right. Igκ rearrangement (VκJκ) served as a control 
for B-cell contamination. b, Semi-quantitative PCR analyses of direct VH-to-D 
rearrangements in sorted double-positive T cells (DP-T cells) from indicated 
mice. Assays are diagrammed in Supplementary Figs 6a and 8a. 


Role in proximal VH feedback regulation 

Surface staining of splenic B cells heterozygous for the IGCR1-deleted 
Igma allele and a wild-type Igmb allele did not reveal allelic inclusion 
(Supplementary Fig. 4c). Likewise, no IgMa/IgMb double expressers 
were found in nearly 900 individual IGCR1+/− F1 splenic B cells by 
cytoplasmic staining (Supplementary Fig. 11a). Hybridoma analyses 
showed that about 60% of wild-type B cells had a productive VH(D)JH 
on one allele and a DJH on the other (that is, VH(D)JH+/DJH 
configuration) and about 40% had VH(D)JH rearrangements on both 
alleles (that is, VH(D)JH+/VH(D)JH− configuration) (Fig. 4a). This 
60/40 ratio reflects feedback regulation of VH-to-DJH joining from 
productive rearrangements. In IGCR1+/− B cells, this ratio inverted 
to 30/70, demonstrating that heterozygous IGCR1 deletion markedly 
increases B cells with VH(D)JH joins on both alleles, despite allelic 
exclusion at the protein level. Analyses of 39 VH(D)JH/VH(D)JH 
IGCR1+/− B-cell hybridomas revealed that most had a VH(D)JH+ 
that used a distal VH and a VH(D)JH− that used VH81X or a nearby 
proximal VH (Supplementary Table 6). The skewed VH(D)JH/ 
VH(D)JH ratio in IGCR1+/− B cells can be explained by frequent 
early formation of VH81XDJH rearrangements on the mutant allele. 
Thus, VH81XDJH rearrangements would exclude rearrangement of 
the wild-type allele but would be lost developmentally, leading to most 
peripheral B cells deriving from progenitors that formed productive 
V(D)J rearrangements on the wild-type allele subsequent to 
VH81X(D)J rearrangements on the mutant allele (Supplementary 
Fig. 11d). 

The extremely high representation of proximal Vj; segments (for 
example, Vyjgix) rearranged on the IGCRI1-deleted allele might mask 
allelic inclusion because productive Vy4g1x rearrangements are selected 
against cellularly'*'*"*. Therefore, to examine further potential effects of 
IGCRI1-deletion on allelic exclusion, we assayed the V(D)J /DIiy 
versus Viy(D)Jiz*/Viy(D)Jy_ ratio of IGCR1 /~ hybridomas. Because 
both Igh alleles would be similarly biased for proximal V} rearrange- 
ments inIGCR1~/~ B cells, one still would expect the 60/40 ratio if Vy4- 
to-DJ} recombination was feedback regulated (Supplementary Fig. 11e). 
However, we found an inverted ratio of 20/80 in IGCR1/~ hybrido- 
mas (Fig. 4a), strongly suggesting that IGCR1-deleted alleles escape 
feedback regulation, at least for proximal V}; segments (Supplemen- 
tary Fig. 1le). Because of the ambiguities of cellular selection against 
Vusix and the lack of allotypically marked IGCR1-deleted alleles, we 
tested for escape from feedback inhibition by assaying for endogenous 
rearrangements in peripheral B cells from mice with a productive 
Vuy(D)Ju knock-in Igh allele (VB1-8 knock-in) that was IGCRI1~ 
and a second allele that was IGCR1* or IGCRI1 . Notably, 
IGCR1*’~ VB1-8 knock-in B cells had a more than 20-fold increased 
level of Viz7183 rearrangements compared to IGCR1 */* VB1-8 knock- 
in B cells, but little if any change in the very low level rearrangement of 
distal Vj; segments (Fig. 4b). Moreover, most rearrangements in 
IGCR1*’~ VB1-8 knock-in B cells were non-productive Vygix re- 
arrangements (Supplementary Fig. 11f), consistent with a lack of sub- 
stantial allelic inclusion at the protein level in IGCR1*’~ F1 splenic B 
cells resulting from selection against Vizg1x expression (Supplemen- 
tary Figs 4c and 11a). We conclude that IGCR1 is required to allow 
feedback regulation of the most proximal Vj; segments. 
Figure 4 | IGCR1 is required to allow feedback regulation of proximal VH-to-DJH 
recombination. a, Mean percentage of splenic B cells with VH(D)JH 
rearrangements on both Igh alleles as determined by analyses of hybridomas 
from three independent sets of wild-type, IGCR1+/- and IGCR1-/- mice 
(Supplementary Table 5). Error bars represent standard deviation. P values 
were calculated by Student’s t-test. b, Igh VH(D)JH rearrangements in splenic B 
cells from two independent wild-type and VB1-8 knock-in (KI) mice carrying 
either a wild-type (IGCR1+/+ VB1-8 KI) or an IGCR1-deleted (IGCR1+/- 
VB1-8 KI) second allele. Bands corresponding to rearrangements to various JH 
segments are indicated on right. Dlg5 is the loading control. 
IGCR1 mediates chromosomal Igh loops 

We considered that IGCR1 might mediate Igh loops that would 
include iEμ and thereby modulate V(D)J recombination. The next 
CBEs downstream of IGCR1 are a set of 10, about 5 kb downstream of 
the 3' Igh regulatory region (3' Igh CBEs). To test for interactions 
between the IGCR1 and 3' Igh CBEs, we performed quantitative 
chromosome conformation capture (3C) assays on 129SV 
Rag2-/-IGCR1+/+ and Rag2-/-IGCR1-/- A-MuLV-transformed 
pro-B lines. These analyses revealed interaction between the IGCR1 
and 3' Igh CBE locales in Rag2-/-IGCR1+/+ pro-B lines (Fig. 5a and 
Supplementary Fig. 12a), as found in another study’’. We also found 
this interaction in double-positive thymocytes (Supplementary Fig. 13). 
Notably, this interaction was eliminated in Rag2-/-IGCR1-/- pro-B 
lines (Fig. 5a and Supplementary Fig. 12a). We also found interactions 
between the iEμ locale and the IGCR1 and 3' Igh CBE locales in 
Rag2-/-IGCR1+/+ A-MuLV-transformed pro-B cells that were 
diminished in Rag2-/-IGCR1-/- pro-B lines (Fig. 5b and Supplementary 
Fig. 12b). Lastly, we found strong interactions between the iEμ and 
3' Igh regulatory region, as reported for mature B cells**; but these were 
not diminished by IGCR1 deletion (Fig. 5b). These studies demonstrate 
that IGCR1 mediates formation of 300-kb iEμ-containing Igh loops to 
the 3' Igh CBE locale in pro-B lines, with iEμ also being directly 
juxtaposed to the IGCR1 locale in an IGCR1-dependent manner, probably 
within the larger loop. As iEμ lacks CBEs, its interactions with the 
IGCR1 locale are probably mediated, at least in part, by factors other 
than CTCF. 


Discussion 


IGCR1, through its CBEs, mediates ordered and lineage-specific 
VH-to-DJH recombination and balances proximal versus distal VH 
Figure 5 | IGCR1 mediates long-distance Igh chromosomal loops. 
a, Schematic of chromosome interactions between IGCR1-containing and 3' 
Igh CBE-containing KpnI restriction fragments in 3C assays. Interactions 
between IGCR1 and 3' Igh CBE locales in 129SV Rag2-/- and 
Rag2-/-IGCR1-/- A-MuLV-transformed pro-B cells were quantified by 
real-time PCR (TaqMan) using probe P2 (left) and probe P1 (right). b, Schematic of 
chromosome interactions between iEμ-containing KpnI restriction fragment 
and indicated KpnI restriction fragments in other Igh locales. Interactions 
between iEμ and IGCR1, iEμ and 3' Igh regulatory region (RR) locales, iEμ 
and 3' Igh CBE locales in Rag2-/- and Rag2-/-IGCR1-/- A-MuLV-transformed 
pro-B cells were quantified by real-time PCR using a probe (P3) 
from the iEμ locale. F1-F8 indicate primers used for PCR. K indicates KpnI 
sites. Red arcs indicate interactions detected in Rag2-/- cells. The average 
association frequency of three independent 3C experiments with two 
independent A-MuLV-transformed lines from each genotype is shown with 
standard deviation indicated. 
rearrangement. Indeed, IGCR1 functions are required for an Igh allele 
to efficiently generate peripheral B cells. Notably, IGCR1 and its CBEs 
are not required for overall VH-to-DJH recombination levels, but 
rather to decrease relative recombination of proximal VH segments, 
particularly VH81X. Inability of the dominant VH81X to promote B-cell 
development probably leads to developmental defects associated with 
IGCR1 mutations. Yet, the enigmatic VH81X is strongly conserved 
across mouse strains® and, correspondingly, has been suggested to 
have important roles in early antibody repertoires*”. Now, we find that 
IGCR1 has a key role in regulating VH81X rearrangement. IGCR1 also 
is required to allow feedback regulation of proximal VH-to-DJH 
rearrangements, implicating IGCR1 as a critical element for the allelic 
exclusion of VH81X and other very proximal VH segments. Our findings 
indicate that IGCR1 allows feedback by suppressing early, unordered 
proximal VH rearrangement, providing the first evidence, to 
our knowledge, in support of a long-standing hypothesis that ordered 
VH-to-DJH joining provides a means of mediating allelic exclusion”®. 
However, we found no evidence for loss of feedback regulation of 
distal VH segments, in accordance with the proposal that locus 
contraction mediates their allelic exclusion”. 

Our findings show that IGCR1-mediated promotion of the utilization 
of VH segments up to several megabases distant does not involve 
alterations in distal VH transcription. In pro-B cells, Igh contraction 
promotes distal VH usage****. In the absence of certain transcription 
(for example, Pax-5 or YY1) or chromatin-modifying (for example, 
Ezh2) factors, distal VH transcription is unimpaired but Igh 
contraction does not occur, diminishing distal VH rearrangement. In such 
factor-deficient pro-B cells, transcription and rearrangement of 
proximal VH segments do not increase**'*’, in contrast to the 
marked increases in IGCR1-/- pro-B cells. This phenotypic difference 
is consistent with IGCR1 normalizing VH repertoires via 
mechanisms other than Igh contraction. We suggest that IGCR1 
promotes distal VH usage indirectly by preventing premature proximal 
VH rearrangement via insulating functions before contraction, 
thereby preserving DJH substrates for distal VH rearrangement. The 
location of CBEs throughout the VH portion of Igh led to the notion 
that recruitment of VH segments into DJH recombination centres** 
subsequent to contraction is promoted via interaction of VH and 
IGCR1 CBEs*. Owing to the dominance of proximal VH rearrangements 
on IGCR1-mutant alleles, assays for such putative IGCR1 functions 
require additional model systems. 

IGCR1 CBEs suppress inappropriate transcription and rearrangement 
of proximal VH segments 100 kb or more upstream. These 
suppressive functions are consistent with enhancer insulating 
functions of CBEs in vitro”**, which may relate to loop formation**. We 
propose that IGCR1 CBEs mediate loops with downstream 3' Igh 
CBEs that segregate the D/JH and VH portions of Igh into separate 
regulatory domains during the D-to-JH rearrangement stage of B-cell 
development, blocking activity of iEμ or other elements beyond 
IGCR1 (refs 17, 19; Supplementary Fig. 14a). Thus, inactivation of 
the IGCR1 CBEs allows transcriptional enhancing activity to extend 
to the proximal VH segments, promoting their premature rearrangement 
(Supplementary Fig. 14b). Notably, such activity does not 
appear to extend beyond the most proximal VH segments, which 
may result from formation of new CBE-mediated loops to upstream 
VH CBEs in the absence of IGCR1. In DJH-containing pro-B cells, 
IGCR1-insulating functions that prevent VH-to-DH rearrangements 
must be neutralized to allow VH-to-DJH joining (Supplementary 
Fig. 14c). As CTCF binding to Igh CBEs does not vary with B-cell 
stage’, other factors must modulate activity of bound CTCF 
within IGCR1 to allow for Igh-specific functions. Such factors might 
include CTCF modifications, interacting proteins such as cohesin**”’, 
or CBE sequence context® and orientation*®**”. In addition, other 
putative binding elements within IGCR1 may recruit proteins, 
such as YY1, that have been implicated in modulating CTCF 
function”. 


METHODS SUMMARY 


Mice. The targeting strategy and analysis of IGCR1-deleted and CBE-mutated 
embryonic stem (ES) cells are diagrammed in Supplementary Figs 2a and 3a (see 
Methods for details). The Institutional Animal Care and Use Committee of The 
Children’s Hospital (Boston, Massachusetts) approved all animal work. 

V(D)J rearrangement assays. PCR assays for D-to-JH or VH-to-DJH rearrangements 
were performed as described” (see Supplementary Table 1 for primers). 
Generation of B-cell hybridomas and V(D)J recombination analyses were 
performed as described®. 

RT-PCR and northern blot. RT-PCR and northern blotting assays for germline 
transcripts of Igh gene segments were performed as described” (primers for RT- 
PCR and northern blot probes are in Supplementary Table 1). 

3C. 3C assays were performed as described”. 

ChIP-seq/ChIP-qPCR assays. Assays were done as described”. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Generation of IGCR1-deleted mice. A targeting construct was designed to 
replace IGCR1 (4.1 kb) with a NeoR gene cassette oriented in the direction from 
the D clusters to V clusters (Supplementary Fig. 2a). A 4.3-kb arm upstream of 
IGCR1 and a 2.9-kb arm downstream of IGCR1 were PCR amplified (see 
Supplementary Table 1 for primers) from TC1 embryonic stem (ES) cell DNA 
(129 strain) and cloned into the pLNTK targeting vector in the desired orienta- 
tion. The targeting construct was then electroporated into TC1 ES cells, and 
successful targeting assessed by Southern blot analyses using StuI- or SpeI- 
digested genomic DNA and upstream or downstream genomic probes as outlined 
in detail in Supplementary Fig. 2a. Three independently targeted ES clones were 
subjected to adenovirus-mediated Cre deletion to remove the NeoR gene and 
injected for Rag2-deficient blastocyst complementation (RDBC)” or for germline 
transmission. 

Generation of IGCR1 CBE-mutated mice. Two 4.2-kb DNA fragments 
consecutively located over the IGCR1 region were PCR amplified (see Supplementary 
Table 1 for primers) and cloned into a pGEM-T easy (Promega) vector 
(Supplementary Fig. 3a). One fragment included CBE1 and the other included 
CBE2. PCR site-directed mutagenesis was used to introduce scrambled mutations 
of the 20-bp CBE1 and 19-bp CBE2 sites in each arm, respectively (see 
Supplementary Table 1 for primers). Restriction endonuclease recognition sites were 
incorporated into the mutated CBE sequences (NheI for upstream and SpeI for 
downstream arms). Then, these two DNA fragments were cloned into a targeting 
vector pLNTK as upstream and downstream arms. The targeting construct was 
electroporated into TC1 ES cells, and successfully targeted clones, including no 
mutations (loxP insertion control) and the CBE1 and 2 double mutation, were 
assessed by Southern blot analyses using StuI-, SpeI- or HindIII/NheI-digested 
genomic DNA with appropriate probes (Supplementary Fig. 3b). Two independently 
targeted clones were subjected to adenovirus-mediated Cre deletion to 
remove the NeoR gene and injected for RDBC or for germline transmission. 
For RDBC, sorted double-positive T cells from chimaeras were genotyped by 
PCR and restriction enzyme digestion (see Supplementary Fig. 3c). Wild-type 
129SV and C57BL/6 mice were purchased from the Jackson Laboratory. 
Rag2-deficient mice on a 129 background were purchased from Taconic. 
Electrophoretic mobility shift assay. Probes were prepared by annealing 
complementary oligonucleotides (Supplementary Table 1). Annealed 
oligonucleotides were purified on 4% agarose gels and end-labelled with 32P-γATP. 
Nuclear extracts were prepared from Rag2-deficient A-MuLV-transformed 
pro-B cell lines. Electrophoretic mobility shift assay (EMSA) reactions were 
conducted in a mixture of 5% glycerol, 150 mM KCl, 20 mM HEPES, pH 7.9, 5 mM 
MgCl2, 1 mM dithiothreitol (DTT), 0.5% Triton X-100 and 400 ng poly(dG-dC). 
Nuclear extract (2 μg) was incubated with anti-CTCF or anti-IgG antibodies at 4 °C 
for 20 min, and labelled probes and/or unlabelled competitor probes were added 
to the reactions. The reactions were electrophoresed with 0.5× TBE buffer 
(89 mM Tris base, 89 mM boric acid, 2 mM EDTA, pH 8.0) at 30 V, and visualized 
by autoradiography. 

V(D)J recombination assays. Genomic DNA was purified from sorted bone 
marrow pro-B (IgM-B220+CD43+) and pre-B (IgM-B220+CD43-) cells, 
splenic mature B (IgM+B220+CD43-) cells and double-positive T 
(B220-CD4+CD8+) cells. Fivefold serial dilutions of genomic DNA (200 ng, 
40 ng, 8 ng) were used to perform PCR to analyse V(D)J rearrangements. 
Primers used in this assay are listed in Supplementary Table 1. Primers flanking 
exon 6 of the Dlg5 gene were used as a loading control under the same conditions. 
Vκ-to-Jκ rearrangement PCRs were performed to confirm specificity of sorted 
B-cell populations, and to exclude potential B-cell contamination during 
double-positive T-cell analysis. PCR products were separated by gel electrophoresis, 
transferred and analysed for V(D)J recombination by Southern blotting using 
radiolabelled oligonucleotide probes (see Supplementary Table 1 for sequences), 
and visualized by autoradiography. 

RNA isolation and RT-PCR. Total RNA was isolated using Trizol (Invitrogen). 
One microgram of RNA was used to generate cDNA with reverse transcriptase 
Superscript III (Invitrogen) with random hexamers according to manufacturer’s 
protocols. Approximately 1/40 of the reverse-transcription-generated cDNA was 
analysed by PCR. Primers that were used for PCR are provided in Supplementary 
Table 1. 

Intracytoplasmic staining. Intracytoplasmic staining was performed as described 
previously". Briefly, splenic B cells from F1 mice with a wild-type Igm^b allele 
and an IGCR1-deleted Igm^a allele were purified by MACS paramagnetic beads 
following the manufacturer’s protocol and stimulated for 4 days with LPS. Cells 
were fixed, permeabilized and then stained with FITC-labelled anti-mouse IgM^a 
and biotin-labelled anti-mouse IgM^b revealed by streptavidin-conjugated Texas 
Red. Cells were examined using a fluorescent microscope for IgM^a and IgM^b 
allotypic expressers. 

Hybridoma assay and Southern blot. Splenic B cells were isolated from 
wild-type, IGCR1+/- and IGCR1-/- mice, and fused with NS1 cells after stimulation 
with 25 ng ml^-1 IL-4 and 500 ng ml^-1 anti-CD40 antibody for 4 days in culture. 
Hybridoma cells were plated and selected in HAT medium as previously 
described**. Genomic DNA from hybridomas was isolated and digested with 
StuI to determine V(D)J rearrangement configurations by Southern blot 
(Supplementary Fig. 11). DNA from the clones that showed VH(D)JH rearrangement 
on both alleles was subjected to PCR using an upstream VH primer (specific 
to VHJ558, VHQ52 or VH7183 VH gene families) and a downstream JH4 primer, and 
the amplified junctions were cloned and sequenced to identify productive and 
non-productive VH(D)JH rearrangements. 

Transgenic mice. IGCR1-/- mice were bred with the mutant mice harbouring 
a pre-assembled Igh VB1-8DJH4 allele’. B cells were purified by MACS 
paramagnetic beads from VB1-8DJH4 knock-in mice with either wild-type 
IGCR1 (IGCR1+/+ VB1-8 knock-in) or IGCR1 deletion (IGCR1+/- VB1-8 
knock-in) on the other Igh allele. Genomic DNA was isolated and V(D)J 
rearrangements of VH7183, VH81X and VHJ558 segments were amplified as described in 
Supplementary Fig. 6a. 

ChIP. Rabbit polyclonal antibodies recognizing the following histone tail modi- 
fications were used: H3K9ac (Millipore; 07-352), H3K4me2 (Millipore; 07-030) 
and H3K4me3 (Diagenode; pAB-003-050). ChIP analysis and ChIP sequencing 
of A-MuLV-transformed pro-B cells was performed as described*’. The sequence 
reads obtained by paired-end Solexa sequencing with a read length of 76 nucleo- 
tides were mapped to the 129SV mouse reference genome. The ChIP-qPCR 
analysis was performed by quantifying the precipitated DNA on a MyiQ instru- 
ment (Bio-Rad) as described’. The amount of precipitated DNA was determined 
as percentage relative to input DNA to obtain relative enrichment compared to 
the precipitated DNA of the control Bcar3 enhancer”’. Tenfold dilutions of input 
material were used to generate a standard curve, and ChIP samples were quan- 
tified relative to input using the iQ5 software. The oligonucleotides used for real- 
time PCR analysis are shown in Supplementary Table 1. 
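
The quantification described above (a standard curve from tenfold input dilutions, percent of input, then enrichment relative to a control region) can be sketched as follows. This is only an illustrative sketch: the function names and the Ct values are hypothetical, and the instrument software named in the text performs its own fitting, which may differ in detail.

```python
import math

def fit_standard_curve(input_fractions, ct_values):
    """Least-squares fit of Ct = slope * log10(input fraction) + intercept."""
    xs = [math.log10(f) for f in input_fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def percent_of_input(ct, slope, intercept):
    """Read a ChIP sample's Ct off the standard curve as percent of input."""
    return 100.0 * 10 ** ((ct - intercept) / slope)

def relative_enrichment(target_percent, control_percent):
    """Enrichment of a target region relative to the control region."""
    return target_percent / control_percent

# Hypothetical tenfold dilution series of input: 100%, 10% and 1%.
fractions = [1.0, 0.1, 0.01]
cts = [20.0, 23.32, 26.64]  # invented Ct values for illustration only
slope, intercept = fit_standard_curve(fractions, cts)
```

A sample whose Ct equals that of the 10% input standard is read back as roughly 10% of input; dividing the value for a target amplicon by the value for the control amplicon gives the relative enrichment.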

3C. The 3C assays were performed essentially as previously described”. Briefly, 
2 × 10^7 cells were cross-linked with 1% formaldehyde for 10 min. The reaction 
was quenched with glycine (0.125 M). Cells were lysed in 10 mM Tris pH 8, 
10 mM NaCl and 0.2% NP-40 followed by 15 strokes using a Dounce homogenizer. 
The resulting nuclei were washed in restriction enzyme buffer, resuspended 
with the same buffer containing 0.3% SDS, and incubated for 1 h at 37 °C. To 
sequester SDS, 2% Triton X-100 was added, and the sample was incubated for 1 h at 37 °C. 
KpnI (400 U) was added and incubated overnight at 37 °C. KpnI was inactivated with 
1.6% SDS and incubated for 25 min at 68 °C. The samples were ligated in ligation 
buffer (50 mM Tris, 10 mM MgCl2, 1% Triton X-100, 100 mM DTT and 0.1 M 
ATP) with T4 DNA ligase overnight at 16 °C. The crosslinks within 3C library 
products were reversed and the DNA purified by overnight treatment with 
proteinase K at 65 °C as per assay protocol. Quantitative real-time PCR using a 
standard curve was conducted to measure the frequency of the 3C products 
within each sample. Standard curves for 3C assays were generated using BACs 
containing the IGCR1, iEμ and 3' Igh CBE locales within the Igh locus 
(RP23-38K22, RP23-334P5 and RP24-275O24) that were KpnI-digested and then 
religated to generate all possible 3C products within the locus. TaqMan 
real-time PCR was used to determine a 3C frequency by averaging the amount of 3C 
products produced for a given amplicon and dividing that value by the amount of 
loading control determined by the loading control amplicon (see Supplementary 
Table 1). 
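
The 3C frequency calculation described above (average amount of 3C product for an amplicon, divided by the amount of loading control) can be written as a short sketch. The function names and the example numbers are hypothetical; the second helper simply reports the mean and standard deviation across independent experiments, as the Figure 5 legend describes.

```python
from statistics import mean, stdev

def three_c_frequency(product_amounts, loading_control_amount):
    """Average the 3C product amounts measured for one amplicon and
    normalize by the amount of the loading control amplicon."""
    if loading_control_amount <= 0:
        raise ValueError("loading control amount must be positive")
    return mean(product_amounts) / loading_control_amount

def summarize_experiments(frequencies):
    """Mean and sample standard deviation across independent experiments."""
    return mean(frequencies), stdev(frequencies)

# Invented amounts (arbitrary units read off a standard curve).
freq = three_c_frequency([2.0, 4.0], 3.0)
avg, sd = summarize_experiments([1.0, 2.0, 3.0])
```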
Electrons surfing on a sound wave as a platform for 
quantum optics with flying electrons 


Sylvain Hermelin, Shintaro Takada, Michihisa Yamamoto, Seigo Tarucha, Andreas D. Wieck, Laurent Saminadayar, 
Christopher Bäuerle & Tristan Meunier 


Electrons in a metal are indistinguishable particles that interact 
strongly with other electrons and their environment. Isolating and 
detecting a single flying electron after propagation, in a similar 
manner to quantum optics experiments with single photons’”, is 
therefore a challenging task. So far only a few experiments have 
been performed in a high-mobility two-dimensional electron gas in 
which the electron propagates almost ballistically**. In these pre- 
vious works, flying electrons were detected by means of the current 
generated by an ensemble of electrons, and electron correlations 
were encrypted in the current noise. Here we demonstrate the 
experimental realization of high-efficiency single-electron source 
and detector for a single electron propagating isolated from the 
other electrons through a one-dimensional channel. The moving 
potential is excited by a surface acoustic wave, which carries the 
single electron along the one-dimensional channel at a speed of 
3 μm ns^-1. When this quantum channel is placed between two 
quantum dots several micrometres apart, a single electron can be 
transported from one quantum dot to the other with quantum 
efficiencies of emission and detection of 96% and 92%, respectively. 
Furthermore, the transfer of the electron can be triggered on a 
timescale shorter than the coherence time T2* of GaAs spin qubits®. 
Our work opens new avenues with which to study the teleportation 
of a single electron spin and the distant interaction between spa- 
tially separated qubits in a condensed-matter system. 

Quantum electron optics is a field aiming at the realization of 
photon experiments with flying electrons in nanostructures at the 
single-electron level. Important tools with which to infer complex 
photon correlations inaccessible from ensemble measurements are 
single-photon sources and single-photon detectors. In contrast with 
photons, electrons are strongly interacting particles and they usually 
propagate in a Fermi sea filled with other electrons. Each electron 
therefore inevitably mixes with the others of the Fermi sea, which 
implies that the quantum information stored within the charge or 
the spin of the single electron will be lost over short lengths. To per- 
form quantum electron-optical experiments at the single-electron 
level, one therefore needs a source of single electrons, a controlled 
propagating medium and a single-electron detector. It has been pro- 
posed that edge states in the quantum Hall effect can serve as a one- 
dimensional (1D) propagating channel for flying electrons. As a result 
of Coulomb blockade, quantum dots have been demonstrated to be a 
good source of single electrons”* and can also serve as a single-electron 
detector. Indeed, once an electron has been stored in a quantum dot, its 
presence can be inferred routinely by charge detection’. Nevertheless, 
re-trapping the electron in another quantum dot after propagation in an 
edge state turns out to be extremely difficult, and currently all the 
information extracted from such experiments is coming from ensemble 
measurements’*”’. Here we show that a single flying electron—an elec- 
tron surfing on a sound wave—can be sent on demand from a quantum 
dot by means of a 1D quantum channel and re-trapped in a second 
quantum dot after propagation. The 1D quantum channel consists of a 
depleted region several micrometres long in a two-dimensional elec- 
tron gas (2DEG). The electron is dragged along by exciting a surface 
acoustic wave (SAW) and propagates isolated from the other electrons 
inside the 1D channel’. The processes of loading and unloading of the 
flying electron from the quantum channel into a quantum dot turn out 
to be highly efficient. Moreover, we show that the transfer of the 
electron can be triggered with a timescale smaller than the coherence 
time T2* of GaAs spin qubits®. Because both electron spin directions 
are treated on the same footing in the SAW quantum channel, one expects 
that the spin coherence during the transport is conserved. Naturally, 
new possibilities will emerge to address the question of scalability in 
spin qubit systems®’*"*. 

To transport a single electron from one quantum dot to the other 
separated by a 3-μm 1D channel (see Fig. 1 and Methods), the 
following procedure is applied. First, the region between the two electrodes, 
which define the 1D channel, is fully depleted. As a consequence, direct 
linear electron transport from one end of the channel to the other is 
blocked because the Fermi energy lies below the potential induced by 
the gates. Second, by applying microwave excitation to the interdigi- 
tated transducer (IDT), SAW-induced moving quantum dots are 
generated” as a result of the piezoelectric properties of GaAs (see also 
Supplementary Information). By adding a quantum dot to each side of 
the 1D channel and tuning both quantum dots into the single electron 
regime, it is then possible to transport a single electron from one 
quantum dot across the 1D channel and catch it inside the second 
quantum dot. Stability diagrams for both quantum dots as a function 




Figure 1 | Experimental device and measurement setup. Scanning electron 
microscope image of the single-electron transfer device, and diagram of the 
experimental setup. Two quantum dots, which can be brought into the 
single-electron regime, are separated by a 1D channel 3 μm long, as shown. Each 
quantum dot is capacitively coupled to a QPC close by that is used as an 
electrometer’. By applying a microwave burst 65 ns long on the IDT (see 
Methods for details), a train of about 150 moving quantum dots is created in the 
1D channel. Gate V. is connected to a home-made bias tee to allow nanosecond 
manipulation of the dot potential. RF, radio frequency. 
of the applied voltage on the two gates controlling the two barriers of 
the quantum dot are shown in Fig. 2a, b. They demonstrate that the 
system can be tuned into a regime consisting of few electrons’. As 
expected, the charge degeneracy lines disappear when the barrier 
height between each dot and the reservoir is increased (corresponding 
to increasingly negative voltages V;,, and V,,). This also changes the 
position of the quantum-dot minimum and brings the electron closer 
to the 1D channel, to a position where a better transfer to SAW 
quantum dots is expected. 
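
The transport described above can be put into rough numbers using figures quoted elsewhere in the text: a SAW speed of about 3 μm per ns, a 3-μm channel, and a 65-ns microwave burst producing about 150 moving quantum dots. The sketch below is only a back-of-envelope estimate; the function name is hypothetical, and the derived SAW period, frequency and wavelength are inferred from those quoted figures rather than stated in the text.

```python
def saw_transport_estimates(speed_um_per_ns, channel_um, burst_ns, n_dots):
    """Back-of-envelope SAW quantities from the quoted device figures."""
    period_ns = burst_ns / n_dots                # one moving dot per SAW period
    return {
        "period_ns": period_ns,
        "frequency_ghz": 1.0 / period_ns,        # 1 / ns is GHz
        "wavelength_um": speed_um_per_ns * period_ns,
        "transit_ns": channel_um / speed_um_per_ns,
    }

estimates = saw_transport_estimates(
    speed_um_per_ns=3.0,  # SAW speed quoted in the abstract
    channel_um=3.0,       # channel length quoted in the text
    burst_ns=65.0,        # burst duration quoted in the Fig. 1 legend
    n_dots=150,           # approximate number of moving dots per burst
)
```

Under these assumptions the electron crosses the channel in about 1 ns, which is the sense in which the transfer is fast compared with the spin coherence time mentioned in the introduction.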
Figure 2 | Stability diagrams of the two quantum dots and charge detection. 
a, b, Stability diagram of the left (a) and right (b) dot obtained via charge 
detection by varying respectively gate voltages (V, or V.) and (V,, or V.-) (see 
Fig. 1). Sweeps in V,, and V,, are fast and are performed within 1 s from +0.15 V 
to −0.15 V (3 ms per point). When the barrier height is made higher (V, or V,- 
more negative), metastable charge states with timescales longer than the V, or 
Vp sweep time are observed. In the very negative Vy part of the diagram for the 
right dot, the electrons will finally tunnel out. When the sweep direction of Vj, is 
reversed, these charge detection steps are absent. Inset to a: schematic diagram of 
the dots and channel electrostatic potential applied by the gates to the electron at 
different points in the stability diagram (see the text). c, Average QPC time trace 
along the voltage sequence of the single-electron source. Without the microwave 
burst applied on the IDT, we observe a lifetime for the metastable one-electron 
charge state of 700 ms. Applying a microwave burst, the electron in the 
metastable state is forced to quit the quantum dot with very high probability. 
The protocol of the single-electron source for a SAW quantum 
channel is a sequence made of three dot-gate voltage steps (see 
Fig. 2a). At working point A on Fig. 2a, the left quantum dot (the 
single-electron source) is loaded with one electron on a timescale close 
to microseconds and unresolved with the setup detection bandwidth. 
It is then brought rapidly to working point B, where the chemical 
potential of the single electron state lies above the Fermi energy and 
the coupling to the 1D channel is expected to be large. The actual 
position of B is not crucial as long as the electron is sufficiently pro- 
tected from tunnelling out of the dot and the dot potential is high 
enough to facilitate the charging of the electron into the moving 
SAW dot (see inset to Fig. 2a). For each sequence, the quantum point 
contact (QPC) conductance time-trace is recorded to observe single- 
shot loading and unloading of the dot. This sequence is repeated 1,000 
times to obtain measurement statistics; the resulting averaged time- 
traces are shown in Fig. 2c. An exponential decay of the presence of the 
electron in the dot as a function of the time spent at working point B is 
observed in the experimental data, corresponding to a tunnelling time 
close to 1 s as indicated by the green line. This gate pulsing sequence is 
then repeated by adding a burst of microwaves to the IDT with a pulse 
length of several tens of nanoseconds, applied 100 ms after the system 
is brought into position B. The microwave burst creates a moving 
quantum dot, which lifts the electron, initially trapped in the left 
quantum dot, above the tunnel barrier and drags it out of the quantum 
dot. This results in a jump in the QPC current, as shown by the red line. 
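
The exponential decay mentioned above implies that, without the microwave burst, the electron is still very likely to be in the dot when the burst would arrive. A minimal sketch, assuming a simple single-exponential decay with the tunnelling time of about 1 s quoted in the text (the function name is hypothetical):

```python
import math

def survival_probability(wait_s, tunnelling_time_s):
    """Probability that the electron is still in the dot after wait_s,
    assuming single-exponential decay with the given tunnelling time."""
    if tunnelling_time_s <= 0:
        raise ValueError("tunnelling time must be positive")
    return math.exp(-wait_s / tunnelling_time_s)

# Burst applied 100 ms after reaching working point B, tunnelling time ~1 s.
p_present = survival_probability(0.100, 1.0)
```

With these inputs the survival probability is about 0.90, so most of the sudden charge change at the time of the burst can be attributed to the SAW rather than to spontaneous tunnelling. The Fig. 2 legend quotes a 700-ms lifetime instead, which can be passed as the second argument.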

To demonstrate that the electron has been loaded into a moving 
quantum dot and not expelled into the reservoir, it is essential to detect 
the coincidence between events when the electron leaves the single- 
electron source (left dot) and when it is trapped in the single-electron 
detector (right dot). This is realized by a second voltage pulse sequence 
on the right dot: when the single-electron source is brought in position 
B, the detector dot is armed by pulsing its gates to working point B’, 
where the steady state is the zero-electron state and the coupling to the 
channel is large. At this working point both QPC traces are recorded 
simultaneously. No charge variation is observed during the first 50 ms 
where the system is kept in position B. A microwave pulse is sent with a 
time lag of 50 ms. After the recording, the detector is reinitialized to 
zero electron at working point A’, where the captured electron can 
tunnel efficiently into the reservoir. Typical single-shot readout curves 
are presented in Fig. 3a-d. Coincidences are observed between events 
when an electron leaves the source quantum dot and an electron is 
detected in the receiver quantum dot within the same time slot 
(Fig. 3a). These events correspond to the situation in which one elec- 
tron has been loaded in the electron source (left dot), is then trans- 
ferred in the quantum channel (the moving quantum dots) and is 
received in the detector (right dot). In contrast with photon detectors, 
here the electron still exists after detection. A set of experiments 
described in Fig. 3 allows the full characterization of the high quantum 
efficiency of both the single-electron source and the single-electron 
detector observed in the experiment: 96% for the single-electron 
source and 92% for the single-electron detector (see Fig. 3e). 
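
The two efficiencies quoted above can be recomputed from the event counts in the Fig. 3e table (first SAW-on column). The sketch assumes the row labels follow the N(αβγδ) notation of the Fig. 3 legend, so that the source efficiency is N10xx / N1xxx and the detector efficiency is N1001 / N100x; the dictionary keys and function name are illustrative.

```python
# Counts from the first SAW-on column of the Fig. 3e table.
FIG3E_SAW_ON = {
    "N1xxx": 9841,  # traces with one electron loaded in the source dot
    "N10xx": 9408,  # of those, electron left the source after the burst
    "N100x": 9128,  # of those, receiver dot was empty before the burst
    "N1001": 8393,  # of those, electron detected in the receiver dot
}

def quantum_efficiencies(counts):
    """Return (source efficiency, detector efficiency) as fractions."""
    source = counts["N10xx"] / counts["N1xxx"]
    detector = counts["N1001"] / counts["N100x"]
    return source, detector

source_eff, detector_eff = quantum_efficiencies(FIG3E_SAW_ON)
```

Rounded to one decimal place these give 95.6% and 91.9%, matching the percentages printed in the table and the 96% and 92% quoted in the text.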

In quantum dots it is possible to load not just one but two electrons. 
By waiting long enough", the two electrons will be in a singlet state at 
zero magnetic field and are hence entangled in the spin degree of 
freedom. The ability to separate the two electrons and to bring only 
one of them to the second quantum dot is of potential interest for the 
transfer of quantum information and is the essence of the quantum 
teleportation protocol’’”"’. By analogy with photons, this is the 
equivalent of a two-photon entangled source”’. Moreover, in contrast 
with a photon detector, the electron detector can discriminate easily 
whether one, two or more electrons have left the single-electron source 
and are captured in the single-electron detector (see Fig. 2a). The 
protocol consists of loading the left dot with exactly two electrons by 
moving gate voltages V,, and V-. into the two-electron regime of the 
stability diagram. The quantum dot is then tuned towards the working 
point where loading of the moving quantum dots is possible (point B). 


©2011 Macmillan Publishers Limited. All rights reserved 


[Figure 3, panels a-d: single-shot QPC time traces; horizontal axis, Time (ms), 0 to 120. Other axis labels are not recoverable from the extracted text.] 

e 
Event              SAW On          SAW Off     SAW On      SAW On          SAW On 
N1xxx              9,841           10,001      16          5,154           1,462 
N10xx              9,408 (95.6%)   0 (0%)      15 (94%)    4,954 (96.1%)   1,395 (95.4%) 
N100x              9,128           0           14          4,807           1,349 
N1001              8,393 (91.9%)   0 (0%)      14 (100%)   4,417 (91.9%)   1,244 (92.2%) 
N(β+δ > α+γ)       0               1           0           0               0 

Figure 3 | Coincidence between emission and detection of a single electron. 
a-d, Coincidence between the two single-shot QPC time traces at voltage
working points B and B’ corresponding to the different events N_1001 (a), N_1000
(b), N_1100 (c) and N_0000 (d). The position in time of the RF burst is indicated by
the black arrow. At this specific time, the small peak or dip observed on the time
traces is the result of the SAW-induced enhancement or reduction, respectively,
of the QPC current. The notation N_αβγδ corresponds to the number of events
with α or β electrons in the source dot before or after the microwave burst,
respectively, and with γ or δ electrons in the receiver dot before or after the
microwave burst, respectively. When one index is replaced by x, the
corresponding output result is disregarded. Event N_1000 corresponds to the
situation in which the electron has been transferred from the source to the
detector and is immediately kicked out of the detector dot by the same RF burst
and is therefore not detected. Events for which β + δ > α + γ are called ‘bad’
events. e, Summary table for the different events over 10,001 traces for different
source dot loading probabilities (N_1xxx) with or without the RF burst. The
loading probability can be tuned on demand by changing the voltage gate
position A in the stability diagram around the charge degeneracy point. The
summation in the bottom row of the table is for β + δ > α + γ.
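
The percentages in the panel e summary table can be re-derived from the raw counts. The short Python sketch below does this for the three well-populated SAW-on data sets. It assumes that each percentage is the ratio of a row to the row it is nested in (N_10xx to N_1xxx, and N_1001 to N_100x); the labels "emission" and "detection" are illustrative names chosen here, not terms taken from the table.

```python
# Counts transcribed from the Figure 3e summary table (SAW-on data sets 1, 4 and 5).
# Row meanings follow the N_abcd notation of the caption; reading each percentage
# as "row divided by the enclosing row" is an assumption that the numbers bear out.
datasets = [
    {"N_1xxx": 9841, "N_10xx": 9408, "N_100x": 9128, "N_1001": 8393},
    {"N_1xxx": 5154, "N_10xx": 4954, "N_100x": 4807, "N_1001": 4417},
    {"N_1xxx": 1462, "N_10xx": 1395, "N_100x": 1349, "N_1001": 1244},
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Fraction of loaded source dots that are empty after the burst.
emission = [percent(d["N_10xx"], d["N_1xxx"]) for d in datasets]
# Fraction of emitted electrons (receiver initially empty) found in the receiver.
detection = [percent(d["N_1001"], d["N_100x"]) for d in datasets]

print("emission  (%):", emission)
print("detection (%):", detection)
```

Both lists reproduce the values printed in the table, which supports the nested reading of the rows.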


Different possibilities for the emission of electrons into the quantum 
channel are observed. Indeed, when starting with exactly two electrons 
in the source dot, one can achieve the outcome that either exactly one 
or both electrons are emitted from the source and received in the 
detector dot, as shown by the single-shot traces for QPC detection of 
the two dots (see Fig. 4a-d). The probability of each event varies with
the working voltage at point B. For very negative gate voltage V_c, about
half of the time the two electrons are separated, meaning that only one 
electron is transferred, and the other half of the time both electrons are 
transported (see Fig. 4e). For the events in which both electrons leave 
the dot, the electrons are most probably loaded into two different 
moving quantum dots. More interestingly, when pulsing gate voltage 
V_c more positively, a situation can be realized in which only one of the
two electrons of the left dot is efficiently emitted and consequently 
captured by the right dot (see Fig. 4e). In this case, the probability of 
sending the two electrons is markedly reduced, to less than 3%, and the 
probability of effectively separating the two electrons approaches 90%. 

To use single-electron transfer in quantum operations using spin 
qubits, one has to show that coherence of the electron spin after elec- 
tron transfer is preserved. Measurement and coherent manipulations 

Figure 4 | Coincidence between emission and detection of two electrons and
triggered nanosecond electron transfer. a-d, Coincidence between the two
single-shot QPC time traces at voltage working points B and B’ corresponding
to the different events N_2100 (a), N_2101 (b), N_2001 (c) and N_2002 (d). e, Summary
table of the different events over 1,005 traces for dot configurations
V_c = -0.388 V and V_c = -0.322 V. f, Evolution of the number of N_1001 and
N_10xx events as a function of the delay between the 1-ns gate pulse and the 65-ns
microwave burst when a single electron is loaded into the single-electron 
source. g, Schematic diagram of the timing sequence between the 1-ns gate 
pulse and the microwave burst applied to the IDT. 


of electron spins can be straightforwardly implemented in our setup,
and the spin coherence time T2* of an ensemble of electrons stored in
SAW-assisted moving quantum dots has been shown to be as long as
25 ns (ref. 21). A necessary condition for investigating coherent transport
of a single electron spin is to be able to trigger the electron transfer
within a timescale that is short compared with T2*. Indeed, a microwave
pulse 250 ns in duration corresponds to about 700 moving
quantum dots, and the experiments described above demonstrate 
the ability to load the electron into one of the moving quantum dots 
produced by each SAW microwave burst. We now show that the 
number of minima of the microwave burst in which the electron is 
loaded can be reduced to two. For this purpose, the single-electron 
source voltage sequence is slightly modified. After charging of the 
quantum dot, the system is brought to position B (see Fig. 2a) slightly 
on the more negative side with respect to V_c, and the duration of the
microwave pulse is shortened to a minimum of 65 ns. At this voltage 
position, the barrier height to the quantum channel is increased and 
the transfer probability of an electron into the quantum channel is as 
low as 5% when excited with the SAW microwave burst. To trigger 
single-electron transfer, a 1-ns voltage pulse on V_c with a positive value
(voltage position C in Fig. 2a) is added to this sequence. In Fig. 4f the 
evolution of the number of events in which one electron leaves the 
single-electron source and one electron is detected in the single- 
electron detector (N_1001) is plotted as a function of the delay between
the 1-ns gate pulse and the 65-ns microwave burst. High transfer 
probabilities reaching 90% are observed only for time delays of roughly 
765 ns, corresponding to the propagation time of the surface acoustic 
wave from the IDT to the dot region. Taking into account the pulse 
length of the gate and the distance between two minima of the SAW,
only two moving quantum dots can then be the hosts of the trans- 
ported electron during the gate pulse, as indicated schematically in 
Fig. 4g. This demonstrates the ability to load on demand and in a very 
reproducible manner one of the two minima of the train of moving 
quantum dots with a single electron during the 1-ns gate pulse. The use 
of a faster arbitrary waveform generator should allow the electron to be
loaded on demand into the same moving quantum dot. 
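
The timing argument above rests on simple arithmetic, sketched below in Python. The SAW frequency is not quoted in this passage; it is inferred here from the statement that a 250-ns burst spans about 700 moving quantum dots, so every derived number is approximate. The variable names are illustrative.

```python
# Figures quoted in the text. The SAW frequency is inferred, not quoted.
burst_long_ns = 250.0      # burst duration said to span about 700 moving dots
dots_in_long_burst = 700   # "about 700", so the results below are approximate
burst_short_ns = 65.0      # shortened microwave burst
gate_pulse_ns = 1.0        # fast gate pulse that triggers the transfer
delay_ns = 765.0           # delay at which the transfer probability peaks
idt_distance_mm = 2.0      # IDT is "about 2 mm" from the sample (Methods Summary)

# One moving quantum dot per SAW period, so dots per ns is the frequency in GHz.
saw_freq_ghz = dots_in_long_burst / burst_long_ns
minima_in_short_burst = saw_freq_ghz * burst_short_ns
periods_in_gate_pulse = saw_freq_ghz * gate_pulse_ns

# Propagation speed implied by the optimal delay (1 mm/ns = 1e6 m/s).
implied_speed_m_per_s = idt_distance_mm / delay_ns * 1e6

print(f"inferred SAW frequency: {saw_freq_ghz:.2f} GHz")
print(f"minima in a 65-ns burst: {minima_in_short_burst:.0f}")
print(f"SAW periods within the 1-ns gate pulse: {periods_in_gate_pulse:.1f}")
print(f"implied SAW speed: {implied_speed_m_per_s:.0f} m/s")
```

Under these assumptions the 65-ns burst still contains roughly 180 minima, whereas the 1-ns gate pulse spans fewer than three SAW periods, which is the same order as the two candidate minima stated in the text. Because the IDT distance is given only as "about 2 mm", the implied speed is an order-of-magnitude check rather than a measurement.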

These experiments represent the first milestone on the road to a new 
experimental platform for realizing quantum optics with flying electrons 
implemented in gated 2DEG heterostructures and transported by surface
acoustic waves. High quantum efficiencies of both the single-electron
detector and the single-electron source are shown and potentially enable
the measurement of all moments of the electron correlations. In comparison
with other implementations in similar systems, the propagating
electron is physically isolated from the other conduction electrons of the
heterostructure. In bringing together two propagating quantum buses
separated by a tunnel barrier, a beam splitter for flying electrons can be
implemented and Hanbury Brown and Twiss-type experiments in
which there are stronger Coulomb interactions between electrons could 
be realized. Future experiments should allow coherent spin transfer and
provide new insight into the feasibility of quantum teleportation protocols
and into the potential scalability of spin qubits.


METHODS SUMMARY 


The device is defined by Schottky gates in an n-AlGaAs/GaAs 2DEG-based
heterostructure (the properties of the 2DEG are as follows: μ ~ 10^6 cm^2 V^-1 s^-1,
n_s ~ 1.4 × 10^11 cm^-2, depth 90 nm) with standard split-gate techniques. The
charge configuration of both dots is measured by means of the conductance of
both QPCs, each biased with a direct-current voltage of 300 µV; the current is
measured with a current-to-voltage converter with a bandwidth of 1.4 kHz. The
voltage on each gate can be varied on a timescale down to microseconds. In
addition, the gate biased with voltage V_c, controlling the coupling between the
left dot and the 1D channel, is connected to a homemade bias tee to allow
nanosecond manipulation of the dot potential by means of an arbitrary function
generator (Tektronix AWG 5014). The IDT, which is placed about 2 mm to the
left of the sample, is made of 70 pairs of lines 70 µm in length and 250 nm in width
with a 1-µm spacing. The IDT is orientated perpendicular to the direction of the
1D channel defined along the crystal axis [110] of the GaAs wafer; it has a
frequency bandwidth of about 20 MHz.

On-demand single-electron transfer between distant quantum dots


R. P. G. McNeil¹, M. Kataoka¹,², C. J. B. Ford¹, C. H. W. Barnes¹, D. Anderson¹, G. A. C. Jones¹, I. Farrer¹ & D. A. Ritchie¹


Single-electron circuits of the future, consisting of a network of 
quantum dots, will require a mechanism to transport electrons 
from one functional part of the circuit to another. For example, 
in a quantum computer, decoherence and circuit complexity can
be reduced by separating quantum bit (qubit) manipulation from
measurement and by providing a means of transporting electrons
between the corresponding parts of the circuit. Highly controlled
tunnelling between neighbouring dots has been demonstrated,
and our ability to manipulate electrons in single- and double-dot
systems is improving rapidly. For distances greater than a few
hundred nanometres, neither free propagation nor tunnelling is 
viable while maintaining confinement of single electrons. Here we 
show how a single electron may be captured in a surface acoustic 
wave minimum and transferred from one quantum dot to a 
second, unoccupied, dot along a long, empty channel. The transfer 
direction may be reversed and the same electron moved back and 
forth more than sixty times—a cumulative distance of 0.25 mm— 
without error. Such on-chip transfer extends communication 
between quantum dots to a range that may allow the integration 
of discrete quantum information processing components and 
devices. 

Our device consists of two quantum dots connected by a long 
channel (Fig. 1A). Negative voltages applied to patterned metal surface 
gates deplete a two-dimensional electron gas that lies 90 nm below the 

Figure 1 | Device, initialization and single-electron transfer. A, Scanning
electron micrograph of device. Voltages applied to gates (light grey) create
quantum dots (dashed circles) connected by a 4-µm channel. Applying a
microwave (RF) pulse to the left- and right-hand transducers (placed 1 mm
from the device) generates SAW pulses that trap and transport electrons.
B, Schematic of the potential between the LQD and the RQD during
initialization of the LQD with one electron (1e) (a-d) and then the RQD with
no electrons (0e) (e-h). C, Change in detector conductance when SAW pulse
(*) is applied to the system set up as in B, h. The empty RQD is populated when

surface. The voltages are chosen such that the potential of the system is
above the Fermi energy, and in thermal equilibrium the dots and 
channel contain no electrons. 

The quantum dots are adjusted by the two plunger and barrier gates. 
Each plunger raises and lowers the corresponding dot and each barrier 
controls the degree of isolation between that dot and the neighbouring 
reservoir. Charge in each quantum dot is detected by its effect on the 
conductance of high-resistance constrictions on the other side of a
narrow ‘separation’ gate. A single electron can be initialized in one 
quantum dot (Fig. 1B, d) and then transferred at will to the other dot 
using a short burst of surface acoustic waves (SAWs). In a piezoelectric
material (such as GaAs), SAWs create a moving potential modulation 
that can trap and transport electrons. The transferred electron can be 
returned using a second SAW pulse travelling in the opposite direction, 
giving two-way transfer. 

Initialization of the dots is shown schematically in Fig. 1B. To set up 
an occupied left-hand quantum dot (LQD), the left-hand barrier gate 
(LBG) and left-hand plunger gate (LPG) are lowered to populate the 
LQD (Fig. 1B, a); the LBG is raised, isolating the LQD from the reservoir 
(Fig. 1B, b); and the LPG is raised to depopulate the dot selectively, 
leaving one electron (Fig. 1B, c) or more if desired (Supplementary 
Fig. 1). The LBG and LPG can then be stepped to their final voltages 
(Fig. 1B, d). The dot now contains a chosen number of electrons held 
close to, but below, the channel potential. An empty dot is initialized in 

the electron leaves the LQD. The second pair of traces shows a control case in
which the LQD starts empty (0e) (traces are 1 s long). D, Single-electron rally:
the quantum dots and the channel are initialized to be empty before time A.
Between time A and time B, a series of control pulses is used to verify that
the system is empty. At time B an electron is loaded into the LQD. Between time B
and time C, there is two-way transfer of a single electron between the quantum dots.
At time C, the electron is removed from the system using a clearing pulse. The
SAW pulse duration is 300 ns. The small step marked ‘†’ is a random switching
event and is not SAW driven. The time between traces is not plotted.

a similar way but with the plunger gate being raised first (Fig. 1B, e-h).
The final voltages for both the empty and occupied quantum dots 
(Fig. 1B, d and h) are the same and, thus, detector conductance indi- 
cates the number of electrons in each dot (Supplementary Information). 

On-demand depopulation of an initialized quantum dot is achieved 
by a brief SAW pulse. Applying a microwave signal to the left-hand 
transducer generates a SAW. The accompanying potential modulation, 
moving at 2,870 m s^-1, captures the electron from the LQD and transfers
it in 1.4 ns to the right-hand quantum dot (RQD). Part a of Fig. 1C
shows the conductance of the left- and right-hand detectors for an
occupied LQD (1e) and an unoccupied RQD (0e) when a SAW pulse
(300 ns long) is sent from the left (SAW(L)). The transfer of charge is 
shown by simultaneous step changes in the detector conductance. 
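
The quoted 1.4-ns transfer time follows directly from the SAW velocity and the channel length. A minimal Python check is given below; it takes the 4-µm channel length from the Figure 1 caption and treats a rally of 60 transfers as an illustrative lower bound for "more than sixty". The variable names are chosen here for illustration.

```python
saw_velocity_m_per_s = 2870.0   # SAW velocity quoted in the text
channel_length_m = 4e-6         # 4-µm channel (Figure 1 caption)
transfers_in_rally = 60         # illustrative: the text reports "more than sixty"

# Time for the moving potential minimum to cross the channel.
transit_time_ns = channel_length_m / saw_velocity_m_per_s * 1e9

# Total distance covered by one electron over the whole rally.
cumulative_distance_mm = transfers_in_rally * channel_length_m * 1e3

print(f"transit time: {transit_time_ns:.2f} ns")
print(f"cumulative distance for {transfers_in_rally} transfers: "
      f"{cumulative_distance_mm:.2f} mm")
```

Sixty transfers already give about 0.24 mm, consistent with the cumulative distance of 0.25 mm stated in the abstract for slightly longer rallies.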

We know that the quantum dots are not simply exchanging elec- 
trons with their neighbouring reservoirs (in the direction opposite to 
that of SAW propagation) during the SAW pulse sequences, because in 
the control case, with an empty starting dot, no change in detector 
conductance is seen (Fig. 1C, b). It is possible that electrons are instead
being transferred by means of a ‘Newton’s cradle’ arrangement,
whereby an electron from one dot moves into the channel, causing a 
series of electrons in traps along the channel to ‘shuffle up’, ejecting the 
last electron into the second dot. However, the SAW amplitude is 2.5 
times greater than that at which electrons are caught in the channel, so 
there are no electrons to be shuffled along. Thus, in part a of Fig. 1C a 
single electron is being transferred between the dots. 

The two transducers allow for bidirectional transfer between the 
quantum dots, and single electrons (or pairs) can be sent backwards 
and forwards in bursts (as in a game of ‘ping-pong’) with ‘rallies’ 
comprising tens to hundreds of SAW pulses. Figure 1D is an example 
of such a single-electron rally. Both quantum dots are emptied before 
time A, and six control pulses (three SAW(L)-SAW(R) pairs) show the 
system to be empty. At time B, an electron is loaded into the LQD. The 
electron is then sent back and forth by ten alternating SAW pulses (five 
pairs) until at time C the RQD barrier is partly lowered and a ‘clearing’ 
pulse removes the electron from the channel—in this case to the right- 
hand reservoir but potentially into the next section of a quantum dot 
circuit. The small step in the right detector signal (†) is a random
switching event near the detector. It is not coincident with the SAW 
pulse but occurs 50 ms later. No further electron movement is seen in 
the subsequent ten pulses. 

In this device, rallies of over 60 pulses were possible with a single 
electron going back and forth between the quantum dots. A run of 35 
transfers is shown in Fig. 2a, and the statistics of the full data set are 
shown in Fig. 2b. Rallies are broken when the transfer fails, which can 
occur in one of two ways. Occasionally, depopulation of the starting 
dot fails (marked F in Fig. 2b), in which case no electron arrives in the 
second dot. The chances of this can be reduced by increasing the 
potential of the starting dot, towards that of the channel, or by increas- 
ing the SAW amplitude, although larger SAWs can pose problems, for 
example by lifting the transferred electron over the barrier of the 
second dot. Given successful depopulation, transfer may still fail if 
the electron becomes trapped in the channel (marked T). This type 
of failure was more common in pulses from the weaker, right-hand, 
transducer and examples can be seen in Fig. 2a (also marked T). Here a 
SAW(R) pulse fails to transfer the electron all the way to the LQD, 
leaving it trapped in the channel. However, the next pulse from the 
other transducer recovers this electron, returning it to the RQD. The 
probability of recovery (marked R) is lower than the probability of 
transfer (marked S), indicating that electrons trapped in the channel 
may relax deeper into impurity traps than electrons that are carried 
through in SAW minima. This second type of error can also occur in 
another way (X, not shown), described later, but this can be eliminated 
by lowering the potential in the second dot. 

A third error mechanism (marked E in Fig. 2b) is the arrival of an 
additional electron, which is then transferred with the initial electron. 
Electrons are seen to enter the system during pulses from the right-hand 

Figure 2 | Single-electron transfer reliability. a, Example of bidirectional
electron transfer. An electron is transferred between the quantum dots 35 times 
before getting trapped in the channel (T). The next SAW(L) pulse recovers the 
electron (R). The SAW pulse duration is 300 ns, and the time between traces is 
not plotted. b, Transfer statistics for full data set (excerpt seen in a), showing 
probabilities of various events for SAW(L) and SAW(R): ideal transfer (S), 
depopulation to channel trap (T), recovery from channel (R), failure to 
depopulate (F), arrival of additional electron (E), loss of electron from system 
(L). Values in parentheses are for different voltages. 


transducer that may have been caused or exacerbated by adjusting the 
RQD before the SAW(R) pulses started. No electrons appeared in the 
system during SAW(L) pulses. Increasing the isolation of the quantum 
dots and the channel from the surrounding reservoirs will reduce this. 
In none of the traces is the electron seen to leave the system (marked L). 

The ability of SAWs to transport electrons depends on the SAW 
amplitude relative to the potential. Removing an electron from the
starting dot requires a SAW of sufficient amplitude to overcome the 
sloping potential and lift the electron into the channel. If the SAW 
amplitude is too large, it will carry the electron over the far barrier and 
out of the second dot. Thus, there is a practical limit to the SAW 
amplitude for a given barrier-plunger combination, and for small- 
amplitude SAWs the dot needs to be raised towards the channel poten- 
tial. Figure 3a shows the mean initial population of the LQD and 
Fig. 3b shows how depopulation changes with SAW power and 
plunger voltage (V_LPG). The potential gradient between the LQD
and the channel decreases as V_LPG increases, allowing smaller-amplitude
SAWs with a shallower gradient to lift electrons from the
dot. Thus, the onset of depopulation occurs along a diagonal line, 
between the dashed lines in Fig. 3b, and depopulation of the deeper 
dots requires larger-amplitude SAWs. 

The pulse width of the SAWs may be varied instead of the power. It 
has previously been shown that a SAW can be used to modulate the
barriers to an isolated dot, causing population and depopulation of the 
dot in a probabilistic process that requires many cycles to ensure a 
depopulation probability of >50%. Figure 3c shows how SAW pulse 
width, that is, the number of attempts or SAW minima, affects 
depopulation of the LQD. 

Applied pulses are not reproduced exactly in the SAW pulses owing 
to bandwidth limitations of the transducers; pulses longer than 14 ns 
should vary only in duration and not in peak amplitude. At a pulse 
width of 10.0 ns (27.7 cycles), the reduction in pulse amplitude due to 
transducer bandwidth is visible at the lower plunger voltages, where 
electrons cannot be depopulated. At 12.6 ns (34.9 cycles), just ~7
cycles more, depopulation is seen across almost the full range, and, 
as expected, at 14.5 ns the SAW is able to remove electrons over the 
same range as pulses of much longer width. From the rapid onset as the 

Figure 3 | Dependence of LQD depopulation on SAW power and pulse
width. a, Initial population of LQD as a function of plunger voltage (V_LPG). The
same greyscale key is used in b and c to indicate the number of electrons removed
by the pulse. b, Depopulation of the LQD at different SAW(L) transducer powers;
pulse width, 100 ns. The relative slope of the SAW (maximum slope as indicated)
and the potential determine whether depopulation occurs. c, Depopulation of the
LQD at different pulse widths for a SAW(L) power of 11 dBm. Pulse widths
shorter than 14.5 ns do not achieve full amplitude, so depopulation fails at smaller
values of V_LPG. By comparison with b, the peak SAW powers at pulse widths of 10
and 12.6 ns can be estimated as 7 and 10 dBm, respectively. Pulse widths are
measured values of the microwave source and not linearly spaced.


pulse width increases, with depopulation going from approximately 
zero to complete in just 12.5 cycles, we can say that once a sufficient 
SAW amplitude is reached, depopulation occurs during the first few 
(~7) cycles of the pulse. Pulses applied to a transducer with a wider 
bandwidth (fewer fingers) would have shorter rise times, allowing this 
to be probed further. 
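
The cycle counts quoted for each pulse width can be cross-checked with a few lines of Python. The sketch below infers the SAW frequency from the statement that 10.0 ns corresponds to 27.7 cycles, and combines it with the SAW velocity of 2,870 m/s quoted earlier to estimate the wavelength. The inferred frequency and wavelength are derived values, not figures given in the text.

```python
# Reference point quoted in the text: a 10.0-ns pulse contains 27.7 cycles.
ref_width_ns = 10.0
ref_cycles = 27.7
saw_velocity_m_per_s = 2870.0   # SAW velocity quoted earlier in the text

# Cycles per nanosecond is the frequency in GHz (inferred, not quoted).
saw_freq_ghz = ref_cycles / ref_width_ns


def cycles(width_ns):
    """Number of SAW cycles contained in a pulse of the given width."""
    return width_ns * saw_freq_ghz


# Wavelength in micrometres: (m/s) / (1e9 cycles/s) * 1e6 = velocity / (GHz * 1e3).
wavelength_um = saw_velocity_m_per_s / (saw_freq_ghz * 1e3)

onset_span_cycles = cycles(14.5) - cycles(10.0)

print(f"inferred SAW frequency: {saw_freq_ghz:.2f} GHz")
print(f"cycles in 12.6 ns: {cycles(12.6):.1f}")
print(f"cycles between 10.0 ns and 14.5 ns: {onset_span_cycles:.1f}")
print(f"estimated SAW wavelength: {wavelength_um:.2f} um")
```

The 12.6-ns pulse comes out at 34.9 cycles and the span from 10.0 to 14.5 ns at about 12.5 cycles, matching the numbers used in the argument above.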

This system also provides a method of investigating energy loss 
mechanisms for electrons above the Fermi energy. As a SAW minimum
transfers an electron, it lifts it over ‘bumps’ in the potential, raising and 
lowering its potential energy as necessary. However, when the potential 
gradient exceeds the maximum SAW gradient, confinement is lost and 
a ‘hot’ electron escapes backwards towards the channel (Fig. 4a). The 
energy at which this occurs depends on the underlying potential. 
Figure 4b shows how varying the right-hand barrier voltage (V_RBG)
affects the escape probability and the initial energy of escaping elec- 
trons. An electron starts in the RQD and a long (300-ns) SAW pulse is 
sent from the left. Electrons escaping the SAW potential at a low energy 
will remain in the dot (Z in Fig. 4a), at higher energies they will escape to 
the channel (Y), and at energies above the channel maximum they will 
reach the LQD (X). During a pulse, an X or Y electron may be returned 
to the RQD and ‘recycled’, with its ultimate position (LQD, trapped in 
channel, RQD) being determined during the last part of the SAW pulse 
as the amplitude drops. For V_RBG > -1.2 V, transfer to the channel is
unlikely, no electrons are transferred to the LQD and the probability
of staying in the RQD (Z) is >90%. For V_RBG < -1.3 V, the probability
of leaving the RQD (X or Y) increases to >50% and the probability of
escaping to the LQD (X) reaches 25%. In Fig. 4b, open symbols are for a 
less negative plunger voltage and show a reduced probability of transfer 
from the RQD because this dot is correspondingly deeper. 

Electrons with a large excess of energy rapidly lose energy by emitting 
an optical phonon (of energy 36 meV) in about 1 ps (ref. 13), com- 
parable to the time taken by an electron to cross one quantum dot. 
Electrons with energies less than 36 meV can emit acoustic phonons 
only with typical energies =0.1 meV, and emit these phonons more 
slowly. In the low-energy limit, this is on a 100-ns timescale. The
addition of a gate across the centre of the channel, capable of being 

Figure 4 | Backscattering of electrons in the RQD due to SAW(L). a, An
electron in the RQD will be lifted up the right-hand barrier by SAW(L) until it
either leaves the system or the underlying potential becomes too steep for the
SAW minimum to retain the electron. Electrons remain in the dot (Z), escape to
traps in the channel (Y) or escape to the LQD (X). b, Probabilities of events X, X
or Y, and Z as functions of barrier voltage. (Open symbols are for a slightly
deeper RQD potential (V_RPG).) A threshold is evident at around -1.3 V: for
V_RBG < -1.3 V, escape to the LQD is possible, whereas for V_RBG > -1.3 V, escape
to the LQD is prevented by the channel potential. Error bars, 1 s.d.


pulsed at high frequencies, would provide a method of investigating 
emission of acoustic phonons by high-energy electrons. 

This source of high-energy electrons may be of use in p-n junction
devices as a way to controllably introduce single electrons into a region
of holes as a single-photon source, without requiring negatively
charged gates in close proximity to the holes. 

To be useful in a quantum information circuit, the transfer of an 
electron must not cause its spin state to decohere. Coherent transfer of a
collection of spins has been demonstrated over a distance of 70 µm (for
a particular wafer orientation), with the potential to extend this much
further; and coherent oscillations of charge have been shown over a
submicrometre distance. Fluctuations in the magnetic field created by
nuclear spins (B_nuc) are the main cause of dephasing in static quantum
dots; however, an electron trapped in a moving SAW quantum dot
samples many different local B_nuc fields, spending only a brief time
in each. The average B_nuc and, hence, dephasing, is reduced by three
orders of magnitude owing to the motion of the SAW (more details of 
dephasing mechanisms are given in Supplementary Information). It is 
therefore likely that coherent transfer of spins is achievable and that 
dephasing will actually be suppressed during transfer. 

In an ideal quantum dot network, with a perfectly smooth potential, 
an electron could simply be allowed to ‘roll’ from an elevated starting 
dot down to the second dot. In practice, the potential is far from perfect 
and irregularities in the background potential would make this method 
of transfer highly unreliable. A pulse of SAWs, however, can be used to 
modulate the channel temporarily, assisting the transfer in a peristalsis- 
like movement, the amplitude of which can be tuned to the minimum 
required to overcome desired obstacles, allowing on-demand removal 
and delivery of single electrons between distant quantum dots in a 
manner that should be compatible with many of the quantum com- 
puting proposals based on electronic spin states in semiconductors. 


METHODS SUMMARY 


The two-dimensional electron gas was formed at the interface of a GaAs/AlGaAs
heterostructure. Before depletion, the carrier density was 1.6 × 10^11 cm^-2 and the
carrier mobility was 1.8 × 10^6 cm^2 V^-1 s^-1. We made several devices; results for
devices B and C are reported here. Devices and transducers were patterned by
electron-beam lithography. All measurements were made at 300 mK. Radio-frequency
signals were applied to transducers using an Agilent 8648D source
(external modulation option). To prevent Bragg reflections, transducers were of
double-element design, with 30 pairs of fingers. The detector circuits shared a
common source with a ~1-mV d.c. bias. In device B, the position of the RQD was
adjusted between the capture and transfer positions to aid depopulation by the
weaker, right-hand, transducer. This adjustment shifted the dot minimum relative
to the right-hand detector, making the return steps smaller. The gate set-up time
between traces was 2-8 s. The applied radio-frequency power in Fig. 1D was
10 dBm for SAW(L) and 18 dBm for SAW(R), and the attenuation from the source
to the transducers was 10 dB for SAW(L) and 20-30 dB for SAW(R).
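
As a rough illustration of what these power levels mean at the transducers, the Python sketch below subtracts the quoted attenuation from the applied power and converts the result to milliwatts. It assumes that the power reaching each transducer is simply the applied power minus the line attenuation; reflections and the conversion efficiency of the transducers are ignored, and the helper names are hypothetical.

```python
def dbm_to_mw(power_dbm):
    """Convert a power level in dBm to milliwatts (0 dBm = 1 mW)."""
    return 10 ** (power_dbm / 10)


def at_transducer(applied_dbm, attenuation_db):
    """Power reaching the transducer, assuming only the quoted line attenuation."""
    return applied_dbm - attenuation_db


# SAW(L): 10 dBm applied, 10 dB attenuation.
saw_l_dbm = at_transducer(10, 10)

# SAW(R): 18 dBm applied, attenuation quoted as a range of 20-30 dB.
saw_r_dbm_range = (at_transducer(18, 30), at_transducer(18, 20))

print(f"SAW(L): {saw_l_dbm} dBm = {dbm_to_mw(saw_l_dbm):.2f} mW")
print(f"SAW(R): {saw_r_dbm_range[0]} to {saw_r_dbm_range[1]} dBm = "
      f"{dbm_to_mw(saw_r_dbm_range[0]):.3f} to {dbm_to_mw(saw_r_dbm_range[1]):.3f} mW")
```

On this simple estimate the right-hand transducer receives less power than the left-hand one despite the higher applied power, which is consistent with its description as the weaker transducer.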

Bioinspired self-repairing slippery surfaces with pressure-stable omniphobicity

Creating a robust synthetic surface that repels various liquids
would have broad technological implications for areas ranging
from biomedical devices and fuel transport to architecture but
has proved extremely challenging. Inspirations from natural non-wetting
structures, particularly the leaves of the lotus, have led to
the development of liquid-repellent microtextured surfaces that
rely on the formation of a stable air-liquid interface. Despite
over a decade of intense research, these surfaces are, however, still
plagued with problems that restrict their practical applications:
limited oleophobicity with high contact angle hysteresis, failure
under pressure and upon physical damage, inability to
self-heal and high production cost. To address these challenges,
here we report a strategy to create self-healing, slippery liquid-infused
porous surface(s) (SLIPS) with exceptional liquid- and
ice-repellency, pressure stability and enhanced optical transparency.
Our approach, inspired by Nepenthes pitcher plants,
is conceptually different from the lotus effect, because we use nano/
microstructured substrates to lock in place the infused lubricating
fluid. We define the requirements for which the lubricant forms a
stable, defect-free and inert ‘slippery’ interface. This surface outperforms
its natural counterparts and state-of-the-art synthetic
liquid-repellent surfaces in its capability to repel various
simple and complex liquids (water, hydrocarbons, crude oil and
blood), maintain low contact angle hysteresis (<2.5°), quickly
restore liquid-repellency after physical damage (within 0.1-1 s),
resist ice adhesion, and function at high pressures (up to about
680 atm). We show that these properties are insensitive to the precise
geometry of the underlying substrate, making our approach
applicable to various inexpensive, low-surface-energy structured
materials (such as porous Teflon membrane). We envision that
these slippery surfaces will be useful in fluid handling and transportation,
optical sensing, medicine, and as self-cleaning and anti-fouling
materials operating in extreme environments.

The cutting edge in development of synthetic liquid-repellent surfaces
is currently inspired by the lotus effect: water droplets are supported
by surface textures on a composite solid-air interface that enables
them to roll off easily. However, this approach, while promising,
suffers from inherent limitations that severely restrict its applicability.
First, trapped air is a largely ineffective cushion against organic liquids or
complex mixtures that, unlike water, have low surface tension, which
strongly destabilizes suspended droplets. Moreover, the air trapped
within the texture cannot stand up to pressure, so that liquids, particularly
those with low surface tension, can easily penetrate the texture
under even slightly increased pressures or upon impact, conditions
commonly encountered with driving rain or in underground transport
pipes. Furthermore, synthetic textured solids are prone to irreversible
defects arising from mechanical damage and fabrication imperfections:
because each defect enhances the likelihood of the droplet
pinning and sticking in place, textured surfaces are not only difficult
to optimize for liquid mobility but inevitably stop working over time as
irreparable damage accumulates. Recent progress in pushing these limits
with increasingly complex structures and chemistries remains outweighed
by substantial trade-offs in physical stability, optical properties,
large-scale feasibility, and/or difficulty and expense of fabrication.

Nature, however, offers a remarkably simple alternative idea that
has nothing to do with the lotus effect yet again capitalizes on microtextures:
instead of using the structures to repel impinging liquids
directly, systems such as the Nepenthes pitcher plant use them to
lock in an intermediary liquid that then acts by itself as the repellent
surface. Well-matched solid and liquid surface energies, combined
with the microtextural roughness, create a highly stable state in which
the liquid fills the spaces within the texture and forms a continuous
overlying film. In pitcher plants, this film is aqueous and effective
enough to cause insects that step on it to slide from the rim into the
digestive juices at the bottom by repelling the oils on their feet.

Inspired by this idea, we report synthetic liquid-repellent surfaces— 
which we name ‘slippery liquid-infused porous surface(s)’ (SLIPS)— 
that each consist of a film of lubricating liquid locked in place by a 
micro/nanoporous substrate (Fig. 1a). The premise for our design is
that a liquid surface is intrinsically smooth and defect-free down to the 
molecular scale; provides immediate self-repair by wicking into 
damaged sites in the underlying substrate; is largely incompressible; 
and can be chosen to repel immiscible liquids of virtually any surface 
tension. We show that our SLIPS creates a smooth, stable interface that 
nearly eliminates pinning of the liquid contact line for both high- and 
low-surface-tension liquids, minimizes pressure-induced impalement 
into the porous structures, self-heals and retains its function following 
mechanical damage, and can be made optically transparent. 

We designed the SLIPS based on three criteria: (1) the lubricating 
liquid must wick into, wet and stably adhere within the substrate, (2) 
the solid must be preferentially wetted by the lubricating liquid rather 
than by the liquid one wants to repel, and (3) the lubricating and 
impinging test liquids must be immiscible. The first requirement is 
satisfied by using micro/nanotextured, rough substrates whose large 
surface area, combined with chemical affinity for the liquid, facilitates 
complete wetting by, and adhesion of, the lubricating fluid (Sup- 
plementary Fig. 1)*”*. To satisfy the second criterion—the formation 
of a stable lubricating film that is not displaced by the test liquid 
(Fig. 1b)—we determine the chemical and physical properties required 
for working combinations of substrates and lubricants. We compare 
the total interfacial energies of textured solids that are completely 
wetted by either an arbitrary immiscible liquid (E_A), or a lubricating
fluid with (E_1) or without (E_2) a fully wetted immiscible test liquid
floating on top of it. To ensure the solid is wetted preferentially by
the lubricating fluid one should have ΔE_1 = E_A − E_1 > 0 and
ΔE_2 = E_A − E_2 > 0. The equations can be expressed as (see
Supplementary Discussion)**:


$$\Delta E_1 = R(\gamma_B \cos\theta_B - \gamma_A \cos\theta_A) - \gamma_{AB} > 0 \qquad (1)$$


School of Engineering and Applied Sciences, Wyss Institute for Biologically Inspired Engineering and Kavli Institute for Bionano Science and Technology, Harvard University, Cambridge, Massachusetts
Figure 1 | Design of SLIPS. a, Schematics showing the fabrication of a SLIPS
by infiltrating a functionalized porous/textured solid with a low-surface- 
energy, chemically inert liquid to form a physically smooth and chemically 
homogeneous lubricating film on the surface of the substrate (see Methods 
Summary). b, Comparison of the stability and displacement of lubricating 
films on silanized and non-silanized textured epoxy substrates. Top panels 
show schematic side views; bottom panels show time-lapse optical images of 


$$\Delta E_2 = R(\gamma_B \cos\theta_B - \gamma_A \cos\theta_A) + \gamma_A - \gamma_B > 0 \qquad (2)$$
where γ_A and γ_B are the surface tensions for the test liquid to be
repelled and the lubricating fluid, γ_AB is the interfacial tension at the
liquid-liquid interface, θ_A and θ_B are the equilibrium contact angles of
the immiscible test liquid and the lubricating fluid on a flat solid
surface, and R is the roughness factor (the ratio between the actual
and projected surface areas of the textured solids”).
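
As an illustration of how the two stability conditions might be checked for a candidate substrate and lubricant pair, the following minimal Python sketch evaluates the left-hand sides of equations (1) and (2). The surface tensions of water (72.4 mN m⁻¹) and FC-70 (17.1 mN m⁻¹) are taken from the text; the roughness factor, interfacial tension and contact angles are hypothetical placeholders chosen only to exercise the formulas, and the function names are ours rather than part of any published code.

```python
import math


def slips_energy_differences(R, gamma_a, gamma_b, gamma_ab, theta_a_deg, theta_b_deg):
    """Evaluate the left-hand sides of equations (1) and (2).

    All tensions share one unit (mN/m here); angles are in degrees.
    Subscript A is the test liquid, subscript B is the lubricant.
    """
    wetting_term = R * (
        gamma_b * math.cos(math.radians(theta_b_deg))
        - gamma_a * math.cos(math.radians(theta_a_deg))
    )
    delta_e1 = wetting_term - gamma_ab           # equation (1)
    delta_e2 = wetting_term + gamma_a - gamma_b  # equation (2)
    return delta_e1, delta_e2


def lubricant_film_is_stable(**params):
    """The film is predicted to be stable only if both differences are positive."""
    delta_e1, delta_e2 = slips_energy_differences(**params)
    return delta_e1 > 0 and delta_e2 > 0


# gamma_a and gamma_b come from the text; R, gamma_ab and both angles are
# hypothetical values for a fluorinated (low-energy) textured solid.
water_on_fluorinated = dict(R=2.0, gamma_a=72.4, gamma_b=17.1, gamma_ab=56.0,
                            theta_a_deg=115.0, theta_b_deg=15.0)

# Hypothetical hydrophilic solid that water wets better than the lubricant does.
water_on_hydrophilic = dict(R=2.0, gamma_a=72.4, gamma_b=17.1, gamma_ab=56.0,
                            theta_a_deg=20.0, theta_b_deg=40.0)

print(lubricant_film_is_stable(**water_on_fluorinated))
print(lubricant_film_is_stable(**water_on_hydrophilic))
```

With these assumed inputs the first combination satisfies both inequalities and the second violates them, which mirrors qualitatively the contrast between a low-energy substrate that retains the lubricant and one from which the test liquid displaces it.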

From these principles, we fabricated a set of SLIPS designed to 
repel liquids spanning a broad range of surface tensions. To generate 
roughness, we tested two types of porous solids, periodically ordered 
and random: arrays of nanoposts functionalized with a low-surface- 
energy polyfluoroalkyl silane**, and a random network of Teflon nano- 
fibres distributed throughout the bulk substrate, respectively (Fig. 1c). 
For the lubricating film, we chose low-surface-tension perfluorinated 
liquids (for example, 3M Fluorinert FC-70, γ_B = 17.1 mN m⁻¹; or
DuPont Krytox oils) that are non-volatile and are immiscible with both 
aqueous and hydrocarbon phases and therefore able to form a stable, 
slippery interface with our solid substrates (that is, AE, > 0 and AE, > 0) 
for a variety of polar and non-polar liquids including water, acids and 
bases, alkanes, alcohols and ketones (Figs 1d and 2a, b). The SLIPS were 
generated through liquid imbibition into the porous materials”, result- 
ing in a homogeneous and nearly molecularly smooth surface with a 
roughness of about 1 nm (Supplementary Fig. 2). 

Each of these SLIPS exhibits extreme liquid repellency as signified by 
very low contact angle hysteresis (Δθ < 2.5°, Fig. 2b) and by very low
sliding angles (α ≤ 5° for droplet volume ≥ 2 μl; Supplementary Fig. 3)
against liquids of surface tension ranging from ~17.2 ± 0.5 mN m⁻¹
(n-pentane) to 72.4 ± 0.1 mN m⁻¹ (water). Contact angle hysteresis
(that is, the difference between the advancing and receding contact 
angles of a moving droplet), and sliding angle (that is, the surface 
tilt required for droplet motion) directly characterize resistance to 
mobility”; the low values therefore confirm a lack of pinning, consist- 
ent with a nearly defect-free surface’’. Based on the measured contact 
angle hysteresis and droplet volume (~4.5 μl), the estimated liquid
retention force* on each of the SLIPS is 0.83 ± 0.22 μN for n = 6.
This performance is nearly an order of magnitude better than the
state-of-the-art lotus-leaf-inspired omniphobic surfaces, whose liquid
retention forces are of the order of 5 μN for low-surface-tension liquids
(that is, γ_A < 25 mN m⁻¹) at similar liquid volumes’. Moreover, the
liquid-repellency of SLIPS is insensitive to texture geometry (Fig. 2b), 
provided that the lubricating layer covers the textures (Supplementary 
Fig. 4). This further confirms that liquid repellency is primarily 
conferred by the lubricating film, with the porous solid having the
top views. Dyed pentane was used to enhance visibility. c, Scanning electron
micrographs showing the morphologies of porous/textured substrate materials: 
an epoxy-resin-based nanofabricated post array (left) and a Teflon-based 
porous nanofibre network (right). d, Optical micrographs demonstrating the 
mobility of a low-surface-tension liquid hydrocarbon, hexane
(γ_A = 18.6 ± 0.5 mN m⁻¹, volume ~3.6 μl), sliding on a SLIPS at a low angle
(α = 3.0°).


secondary, but critically important, role of immobilizing the film. 
Additionally, unlike lotus-leaf-inspired omniphobic surfaces where 
contact angle hysteresis depends on liquid surface tension and 
increases dramatically upon decrease of surface tension (Fig. 2b), such 
a dependence is absent for SLIPS owing to the chemical homogeneity 
and physical smoothness of the liquid-liquid interface. 
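
For orientation, the scale of the retention force can be cross-checked from the sliding angle alone: at the onset of sliding, the component of the droplet's weight along the tilted surface balances the force that pins it. The sketch below applies this balance to the hexane droplet of Fig. 1d (volume ~3.6 μl, α = 3.0°). It is not the hysteresis-based estimate used in the text, and the hexane density of about 655 kg m⁻³ is a handbook value we assume here.

```python
import math

G = 9.81  # gravitational acceleration, m s^-2


def sliding_force_uN(density_kg_m3, volume_ul, sliding_angle_deg):
    """Weight component along the incline at the sliding angle, in micronewtons."""
    volume_m3 = volume_ul * 1e-9  # 1 microlitre = 1e-9 m^3
    weight_n = density_kg_m3 * volume_m3 * G
    return weight_n * math.sin(math.radians(sliding_angle_deg)) * 1e6


# Droplet volume and sliding angle from Fig. 1d; the density is an assumed value.
hexane_force = sliding_force_uN(density_kg_m3=655.0, volume_ul=3.6, sliding_angle_deg=3.0)
print(f"{hexane_force:.2f} uN")
```

Under these assumptions the force comes out at roughly 1 μN, the same order of magnitude as the 0.83 ± 0.22 μN retention force quoted above.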

Experiments performed in a pressurized nitrogen environment 
show that SLIPS are capable of repelling water and liquid hydrocarbons 
both at and while transitioning to a pressure of ~676 atm (the highest 
available pressure in our setup). This is equivalent to the hydrostatic 
pressure at a depth of ~7 km (Fig. 2c, Supplementary Movie 1). To 
our knowledge, the highest recorded pressure stability of a super- 
hydrophobic surface for water is ~7 atm (ref. 16). However, it is 
important to note that pressure stability for structured surfaces 
decreases drastically for liquids with low surface tension. For example, 
recent pressure stability studies of omniphobic surfaces based on 
impacting hexadecane droplets and evaporating octane droplets 
demonstrated stability up to only 400 to 1,400 Pa (4 × 10⁻³ to
1.4 × 10⁻² atm)*"°. Whereas the reported omniphobic surfaces fail
upon dynamic impact of low-surface-tension liquids’®, SLIPS repel 
impacting droplets for a wide assortment of liquid hydrocarbons 
(Supplementary Fig. 5). 
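
The pressure figures quoted in this paragraph can be verified with two unit conversions. The short Python sketch below converts a pressure in atmospheres to the depth of a water column that would exert it, and converts pascals to atmospheres. It assumes fresh water of density 1,000 kg m⁻³ and neglects the 1 atm acting at the surface, so the depth is approximate.

```python
ATM_IN_PA = 101_325.0  # one standard atmosphere in pascals
G = 9.81               # gravitational acceleration, m s^-2


def hydrostatic_depth_m(pressure_atm, density_kg_m3=1000.0):
    """Depth of a liquid column exerting the given pressure (p = rho * g * h)."""
    return pressure_atm * ATM_IN_PA / (density_kg_m3 * G)


def pa_to_atm(pressure_pa):
    """Convert a pressure from pascals to standard atmospheres."""
    return pressure_pa / ATM_IN_PA


depth_km = hydrostatic_depth_m(676.0) / 1000.0
print(f"676 atm corresponds to about {depth_km:.1f} km of water")
print(f"400 Pa = {pa_to_atm(400.0):.1e} atm, 1,400 Pa = {pa_to_atm(1400.0):.1e} atm")
```

The result is close to 7 km for 676 atm, and the two omniphobic-surface limits fall at roughly 4 × 10⁻³ and 1.4 × 10⁻² atm, about five orders of magnitude below the pressure sustained by SLIPS.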

The lubricating film also serves as a self-healing coating to rapidly 
restore the liquid-repellent function following damage of the porous 
material by abrasion or impact. The fluidic nature of the lubricating 
layer means that the liquid simply flows towards the damaged area by 
surface-energy-driven capillary action’, and spontaneously refills the 
physical voids. As observed by high-speed camera imaging, the measured
self-recovery time for a ~50-μm fluid displacement of the FC-70
lubricating layer on an epoxy-resin-based SLIPS is ~150 ms (Fig. 3a)".
Even more impressively, SLIPS can repeatedly restore their liquid- 
repellent function upon recurring, large-area physical damage 
(Fig. 3b, Supplementary Fig. 6 and Supplementary Movie 2). 

We further demonstrate that, by choosing substrate and lubricant 
materials with matching refractive indices, SLIPS can be engineered 
for enhanced optical transparency in visible and/or near-infrared 
wavelengths (Fig. 3c—e). Optical transparency is challenging to achieve 
through superhydrophobic surfaces, because they require nanostructures 
with dimensions under the sub-diffraction limit (<~100 nm)”; the 
large difference in refractive index at the solid—air interface of these 
structured surfaces results in significant light scattering that reduces 
light transmission (Fig. 3c-e). 

In addition to repelling liquids in their pure forms, SLIPS effectively
repel complex fluids, such as crude oil (Fig. 4a, Supplementary
Figure 2 | Omniphobicity and high-pressure stability of SLIPS. a,
Time-sequence images comparing mobility of pentane droplets
(γ_A = 17.2 ± 0.5 mN m⁻¹, volume ~30 μl) on a SLIPS and a
superhydrophobic, air-containing Teflon porous surface. Pentane is repelled 
on the SLIPS, but it wets and stains the traditional superhydrophobic surface. 
b, Comparison of contact angle hysteresis as a function of surface tension of test 
liquids (indicated) on SLIPS and on an omniphobic surface reported in ref. 9. In 
the inset, the advancing and receding contact angles of a liquid droplet are 
denoted as θ_adv and θ_rec, respectively. SLIPS 1, 2 and 3 refer to the surfaces
made of Teflon porous membrane (SLIPS 1), an array of epoxy posts of
geometry 1 (pitch ~2 μm, height ~5 μm, post diameter ~300 nm) (SLIPS 2)
and an array of epoxy posts of geometry 2 (pitch ~900 nm, height ~500 nm to
2 μm, post diameter ~300 nm) (SLIPS 3). Error bars indicate standard
deviations from three independent measurements. c, A plot showing the high
pressure stability of SLIPS, as evident from the low sliding angle of a decane
droplet (γ_A = 23.6 ± 0.1 mN m⁻¹, volume ~3 μl) subjected to pressurized
nitrogen gas in a pressure chamber (Supplementary Methods, Supplementary 
Movie 1). Error bars indicate standard deviations from at least seven 
independent measurements.
Figure 3 | Self-healing and optical transparency of SLIPS. a, Time-lapse
images showing the capability of a SLIPS to self-heal from physical damage
~50 μm wide on a timescale of the order of 100 ms. b, Time-lapse images
showing the restoration of liquid repellency of a SLIPS after physical damage, as 
compared to a typical hydrophobic flat surface (coated with DuPont Teflon AF 
amorphous fluoropolymers) on which oil remains pinned at the damage site 
(Supplementary Movie 2). c, Optical images showing enhanced optical
transparency of an epoxy-resin-based SLIPS (left) as compared to significant
scattering in the non-infused superhydrophobic nanostructured surface (right) 
in the visible light range. Top panels show top views; bottom panels show 
schematic side views. d, Optical transmission measurements for an epoxy- 
resin-based SLIPS in the visible light range (400-750 nm). e, Optical 
transmission measurements for a Teflon-based SLIPS in the near-infrared 
range (800-2,300 nm).
Figure 4 | Repellency of complex fluids, ice and insects by SLIPS.

a, Movement of light crude oil on a substrate composed of a SLIPS, a 
superhydrophobic Teflon porous membrane, and a flat hydrophobic surface. 
Note the slow movement on and staining of the latter two regions 
(Supplementary Movie 3). b, Comparison of the ability to repel blood by a SLIPS,
a superhydrophobic Teflon porous membrane, and a flat hydrophilic glass 
surface. Note the slow movement on and staining of the latter two regions 
(Supplementary Movie 4). c, Ice mobility on a SLIPS (highlighted in green) 
compared to strong adhesion to an epoxy-resin-based nanostructured 
superhydrophobic surface (highlighted in yellow, see also Supplementary Movie 
5). The experiments were performed outdoors (note the snow in the 
background) when temperature and relative humidity were -4 °C and ~45%, 
respectively. Note also the reduced frosting and the resulting transparency of the 
SLIPS. d, Demonstration of the inability of a carpenter ant to hold on to SLIPS. 
The ant (and a drop of fruit jam it is attracted to) slide along the SLIPS when the 
surface is tilted (Supplementary Movie 6). Note that the ant can stably attach to 
normal flat hydrophobic surfaces, such as Teflon. All scale bars represent 10 mm. 


Movie 3) and blood (Fig. 4b, Supplementary Movie 4), that rapidly wet 
and stain most existing surfaces. SLIPS also repel ice (Fig. 4c, 
Supplementary Movie 5) and can serve as anti-sticking, slippery sur- 
faces for insects (Fig. 4d, Supplementary Movie 6)—a direct mimicry 
of pitcher plants. The omniphobic nature of our SLIPS also helps to 
protect the surface from a wide range of particulate contaminants by 
allowing self-cleaning by a broad assortment of fluids that collect and 
remove the particles from the surface (Supplementary Fig. 7 and 
Supplementary Movie 7). Any of these capabilities could be com- 
promised over time if the lubricant evaporates or is lost owing to 
shearing under high flow conditions, so choosing a lubricant with a 
minimal evaporation rate or an enhanced viscosity, or integrating the 
SLIPS with a fluid reservoir that enables continual self-replenishing 
(Supplementary Fig. 8), enables prolonged operation. 

No synthetic surface reported until now possesses all the unique 
characteristics of SLIPS: negligible contact angle hysteresis for low- 
surface-tension liquids and their complex mixtures, low sliding angles, 
instantaneous and repeatable self-healing, extreme pressure stability 
and optical transparency. Our bioinspired SLIPS, which are prepared 
simply by infiltrating low-surface-energy porous solids with lubricating 
liquids, provide a straightforward and versatile solution for liquid repel- 
lency and resistance to fouling. Because low-surface-energy porous 
solids are abundant and commercially available, and the structural 
details are irrelevant to the resulting performance, one can turn any
of these solids into highly omniphobic surfaces without the need to
access expensive fabrication facilities. Any liquid film is inherently 
smooth, self-healing and pressure resistant, so the lubricant can be 
chosen to be either biocompatible, index-matched with the substrate, 
optimized for extreme temperatures, or otherwise suitable for specific 
applications. With a broad variety of commercially available lubricants 
that possess a range of physical and chemical properties, we are cur- 
rently exploring the limits of the performance of SLIPS for long-term 
operation and under extreme conditions, such as high flow, turbulence, 
and high- or low-temperature environments. It is anticipated that 
SLIPS can be developed to serve as omniphobic materials capable of 
meeting emerging needs in biomedical fluid handling, fuel transport, 
anti-fouling, anti-icing, self-cleaning windows and optical devices, and 
many more areas that are beyond the reach of current technologies. 


METHODS SUMMARY 


The lubricating fluids used for the experiments were perfluorinated fluids (such as 
3M Fluorinert FC-70, DuPont Krytox 100 and 103). Two types of porous solids 
were used in the experiments, periodically ordered epoxy-resin-based nanostruc- 
tured surfaces and a random network of Teflon nanofibrous membranes. 
Specifically, Teflon membranes with average pore size of ≥200 nm and thickness
of ~60-80 μm were purchased from the Sterlitech Corporation. These membranes
were used as received without further modification (SLIPS 1 sample). The epoxy-
resin-based nanostructured surfaces were made from silicon masters through the
replica moulding method”. The resulting dimensions of the nanostructures in the
epoxy replica were: diameter ~300 nm, height ~5 μm, pitch ~2 μm for the SLIPS 2
sample, and diameter ~300 nm, height ~500 nm to 2 μm, pitch ~900 nm for the
SLIPS 3 sample. The epoxy replicas were further rendered hydrophobic by putting 
the samples in a vacuum desiccator overnight with a glass vial containing 0.2 ml 
heptadecafluoro-1,1,2,2-tetrahydrodecyltrichlorosilane (available from Gelest 
Inc.). To prepare the SLIPS, lubricating fluid was added onto the porous solids 
to form an over-coated layer. With matching surface chemistry and roughness, the 
fluid will spread spontaneously onto the whole substrate through capillary wicking. 
The thickness of the over-coated layer can be controlled by the fluid volume given a 
known surface area of the sample. Further details of the methods are available in the 
Supplementary Information. 


Received 8 June; accepted 11 August 2011. 


1. Quéré, D. Wetting and roughness. Annu. Rev. Mater. Res. 38, 71-99 (2008). 

2.  Barthlott, W. & Neinhuis, C. Purity of the sacred lotus, or escape from 
contamination in biological surfaces. Planta 202, 1-8 (1997). 

3. Gao, X. F. & Jiang, L. Water-repellent legs of water striders. Nature 432, 36 (2004). 

4. Hansen, W. R. & Autumn, K. Evidence for self-cleaning in gecko setae. Proc. Natl 
Acad. Sci. USA 102, 385-389 (2005). 

5. Gao, X. F. et al. The dry-style antifogging properties of mosquito compound eyes
and artificial analogues prepared by soft lithography. Adv. Mater. 19, 2213-2217 
(2007). 

6. Epstein, A. K., Pokroy, B., Seminara, A. & Aizenberg, J. Bacterial biofilm shows
persistent resistance to liquid wetting and gas penetration. Proc. Natl Acad. Sci.
USA 108, 995-1000 (2011).

7. Quéré, D. Non-sticking drops. Rep. Prog. Phys. 68, 2495-2532 (2005).

8. Tuteja, A. et al. Designing superoleophobic surfaces. Science 318, 1618-1622
(2007).

9. Tuteja, A., Choi, W., Mabry, J. M., McKinley, G. H. & Cohen, R. E. Robust omniphobic
surfaces. Proc. Natl Acad. Sci. USA 105, 18200-18205 (2008).

10. Nguyen, T. P. N., Brunet, P., Coffinier, Y. & Boukherroub, R. Quantitative testing of 
robustness on superomniphobic surfaces by drop impact. Langmuir 26, 
18369-18373 (2010). 

11. Bocquet, L. & Lauga, E. A smooth future? Nature Mater. 10, 334-337 (2011). 

12. Poetes, R., Holtzmann, K., Franze, K. & Steiner, U. Metastable underwater
superhydrophobicity. Phys. Rev. Lett. 105, 166104 (2010).

13. Bohn, H. F. & Federle, W. Insect aquaplaning: Nepenthes pitcher plants capture 
prey with the peristome, a fully wettable water-lubricated anisotropic surface. Proc. 
Natl Acad. Sci. USA 101, 14138-14143 (2004). 

14. Ahuja, A. et al. Nanonails: a simple geometrical approach to electrically tunable 
superlyophobic surfaces. Langmuir 24, 9-14 (2008). 

15. Li, Y., Li, L. & Sun, J. G. Bioinspired self-healing superhydrophobic coatings. Angew.
Chem. Int. Ed. Engl. 49, 6129-6133 (2010). 

16. Lee, C. & Kim, C. J. Underwater restoration and retention of gases on 
superhydrophobic surfaces for drag reduction. Phys. Rev. Lett. 106, 014502 (2011). 

17. Cassie, A. B. D. & Baxter, S. Wettability of porous surfaces. Trans. Faraday Soc. 40, 
0546-0550 (1944). 

18. Cassie, A. B. D. & Baxter, S. Large contact angles of plant and animal surfaces. 
Nature 155, 21-22 (1945). 

19. Shafrin, E. G. & Zisman, W. A. Constitutive relations in the wetting of low energy 
surfaces and the theory of the retraction method of preparing monolayers. J. Phys. 
Chem. 64, 519-524 (1960). 




©2011 Macmillan Publishers Limited. All rights reserved 


20. Bauer, U. & Federle, W. The insect-trapping rim of Nepenthes pitchers: surface
structure and function. Plant Signal. Behav. 4, 1019-1023 (2009).

21. Federle, W., Riehle, M., Curtis, A. S. G. & Full, R. J. An integrative study of insect
adhesion: mechanics and wet adhesion of pretarsal pads in ants. Integr. Comp.
Biol. 42, 1100-1106 (2002).

22. Wenzel, R. N. Resistance of solid surfaces to wetting by water. Ind. Eng. Chem. 28,
988-994 (1936).

23. Courbin, L. et al. Imbibition by polygonal spreading on microdecorated surfaces.
Nature Mater. 6, 661-664 (2007).

24. de Gennes, P.-G., Brochard-Wyart, F. & Quéré, D. Capillarity and Wetting
Phenomena: Drops, Bubbles, Pearls, Waves 15-18 (Springer, 2003).

25. Pokroy, B., Epstein, A. K., Persson-Gulda, M. C. M. & Aizenberg, J. Fabrication of
bioinspired actuated nanostructures with arbitrary geometry and stiffness. Adv.
Mater. 21, 463-469 (2009).

26. Chen, W. et al. Ultrahydrophobic and ultralyophobic surfaces: some comments
and examples. Langmuir 15, 3395-3399 (1999).

27. Delmas, M., Monthioux, M. & Ondarcuhu, T. Contact angle hysteresis at the
nanometer scale. Phys. Rev. Lett. 106, 136102 (2011).

28. Furmidge, C. G. Studies at phase interfaces. 1. Sliding of liquid drops on solid
surfaces and a theory for spray retention. J. Colloid Sci. 17, 309-324 (1962).

29. Ishino, C., Reyssat, M., Reyssat, E., Okumura, K. & Quéré, D. Wicking within forests of
micropillars. Europhys. Lett. 79, 56005 (2007).

30. Nakajima, A., Fujishima, A., Hashimoto, K. & Watanabe, T. Preparation of
transparent superhydrophobic boehmite and silica films by sublimation of
aluminum acetylacetonate. Adv. Mater. 11, 1365-1368 (1999).




Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements T.-S.W. acknowledges funding support from the Croucher
Foundation Postdoctoral Fellowship. We thank K. E. Martin for help with the drop
impact test. We also thank J. C. Weaver and P. Allen for help in manuscript preparation.
The work was supported partially by the AFOSR MURI award
FA9550-09-1-0669-DOD35CAP (optical properties), and ARO MURI award
W911NF-09-1-0476 (robustness and self-repair). We acknowledge the use of the
facilities at the Harvard Center for Nanoscale Systems supported by the NSF under
award ECS-0335765.


Author Contributions T.-S.W. and J.A. conceived the research. J.A. supervised the
research. T.-S.W., S.H.K. and S.K.Y.T. designed the experiments. T.-S.W. carried out
surface wettability characterizations. S.H.K. prepared samples and conducted data
analysis. T.-S.W., S.H.K. and S.K.Y.T. carried out surface morphology characterizations.
T.-S.W. and S.H.K. carried out drop impact tests and ice experiments. E.J.S. and T.-S.W.
carried out the high pressure and optical transmission measurements. B.D.H. and
T.-S.W. carried out blood compatibility tests. T.-S.W., S.H.K., A.G. and J.A. wrote the
manuscript.


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to J.A. (jaiz@seas.harvard.edu). 


22 SEPTEMBER 2011 | VOL 477 | NATURE | 447






doi:10.1038/nature10327 


Widespread iron-rich conditions in the mid-Proterozoic ocean


Noah J. Planavsky, Peter McGoldrick, Clinton T. Scott, Chao Li, Christopher T. Reinhard, Amy E. Kelly, Xuelei Chu,
Andrey Bekker, Gordon D. Love & Timothy W. Lyons


The chemical composition of the ocean changed markedly with the 
oxidation of the Earth’s surface’, and this process has profoundly 
influenced the evolutionary and ecological history of life**. The early 
Earth was characterized by a reducing ocean-atmosphere system, 
whereas the Phanerozoic eon (less than 542 million years ago) is 
known for a stable and oxygenated biosphere conducive to the radi- 
ation of animals. The redox characteristics of surface environments 
during Earth’s middle age (1.8-1 billion years ago) are less well 
known, but it is generally assumed that the mid-Proterozoic was 
home to a globally sulphidic (euxinic) deep ocean”’. Here we present
iron data from a suite of mid-Proterozoic marine mudstones.
Contrary to the popular model, our results indicate that ferruginous
(anoxic and Fe²⁺-rich) conditions were both spatially and temporally
extensive across diverse palaeogeographic settings in the
mid-Proterozoic ocean, inviting new models for the temporal dis- 
tribution of iron formations and the availability of bioessential trace 
elements during a critical window for eukaryotic evolution. 

It is well established that Earth evolved from having an early anoxic 
ocean devoid of eukaryotes to one that is fully oxygenated and teeming 
with complex life. However, the timing and mechanisms of Earth’s 
redox evolution are still debated. Foremost, marine redox conditions 
and atmospheric oxygen levels remain poorly constrained during the 
period between the Earth’s oxygen-deficient early history (more than 
~2.4 billion years (Gyr) ago) and the dominantly oxygenated realm of
the Phanerozoic (the last ~0.542 Gyr). Traditional arguments held
that ocean oxygenation was responsible for the disappearance of large
iron formations at 1.8 Gyr ago (ref. 1). More recently, the majority
opinion among Precambrian workers has instead favoured a deep
mid-Proterozoic ocean with a vast or perhaps even global reservoir
of hydrogen sulphide*®, and H₂S, much like oxygen, would have
titrated the dissolved iron needed for the deposition of iron forma- 
tions. It is further proposed that these euxinic (anoxic and sulphidic) 
conditions would have hindered the expansion and diversification of 
eukaryotes, because of the insolubility of bioessential trace elements, 
such as molybdenum, in sulphidic waters’. Consistent with a shift to 
euxinia, well-preserved sedimentary rocks from the Animikie basin on 
the Superior craton were suggested to capture the transition to a global 
sulphidic ocean® at ~1.8 Gyr ago. It is now apparent, however, that 
large iron formations were deposited tens of millions of years after the 
deposition of this sedimentary succession*” and that iron-rich condi- 
tions persisted in deep waters in the Animikie basin even after the 
deposition of the largest Animikie iron formations*”°, demanding that 
we rethink the spatiotemporal details of Proterozoic ocean redox and 
specifically the character of the mid-Proterozoic ocean (1.8-1.0 Gyr 
ago)’®. 

In contrast with endmember euxinic or oxic Proterozoic deep-ocean 
models, a third possibility has recently been proposed: that anoxic and 
iron-rich deepwater conditions may have been common throughout 
the Precambrian, including the mid-Proterozoic**"°". This surprising 


view of ocean evolution finds its origins in part with recent evidence that 
the ocean was ferruginous in the terminal Proterozoic'”""’, suggesting 
continuity with the iron-formation-favouring conditions present 
before 1.8 Gyr ago. Alternatively, researchers have also asserted that 
the Neoproterozoic was instead a special case—marked by a return to 
the iron-rich state of the early Precambrian as a consequence of super- 
continent break-up", extensive glaciations’*, and drawdown of marine 
sulphate caused by a billion years of deepwater euxinia and pyrite 
burial’. Although tantalizing, the ferruginous mid-Proterozoic model 
is currently hindered by a billion-year gap in direct evidence from the 
geological record for this marine redox state. Our study fills that data 
gap with results from four diverse mid-Proterozoic depositional settings 
that all point to iron-rich marine waters. Included are samples from the 
McArthur basin in north-central Australia—the only basin so far that 
has yielded direct evidence for mid-Proterozoic euxinia’®’”. 

To evaluate ancient redox chemistry, we have applied a well-
established sequential iron extraction scheme to fine-grained sedi-
mentary rocks'*. The accumulation of biogeochemically reactive iron,
termed ‘highly reactive iron’ (Fe_HR), is linked to the redox conditions
in the water column overlying the site of sedimentary deposition. In
modern oxic marine sediments, Fe_HR comprises less than 38% of
the total sedimentary iron pool (that is, Fe_HR/Fe_T < 0.38), reflecting
the detrital sediment flux in the absence of dissolved iron in the O₂-
containing water column. Enrichments beyond this limit
(Fe_HR/Fe_T > 0.38) are a clear signature of transport, scavenging and deposi-
tion of additional iron from an anoxic water column*’*. Because
mineralogical changes associated with even moderate burial alteration
(such as iron uptake into secondary silicate minerals) can decrease the
pool of Fe_HR, the upper limit is possibly lower than 0.38 in older
rocks’, suggesting that essentially all of our samples could have
formed under anoxic conditions (Fig. 1). Where anoxia is indicated,
we can further distinguish between ferruginous (Fe²⁺ > H₂S) and
euxinic (H₂S > Fe²⁺) environments by measuring the extent to which
Fe_HR has reacted with H₂S to form pyrite (Fe_Py/Fe_HR). Accordingly,
anoxic shales with Fe_Py/Fe_HR > 0.8 are considered to have been
deposited under euxinic conditions”.
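
The two-step screening described above can be summarized as a small decision rule. The Python sketch below is only an illustration of that logic: the thresholds 0.38 and 0.8 are the ones quoted in the text, whereas the function name, the category labels and the sample concentrations are hypothetical. Because the anoxic threshold may be lower in older rocks, it is left as an adjustable parameter.

```python
def classify_redox(fe_hr, fe_t, fe_py, anoxic_threshold=0.38, euxinic_threshold=0.8):
    """Classify depositional redox from iron speciation data.

    fe_hr: highly reactive iron, fe_t: total iron, fe_py: pyrite iron,
    all in the same concentration unit (for example wt%).
    """
    if fe_t <= 0 or fe_hr <= 0:
        raise ValueError("iron concentrations must be positive")
    if fe_hr / fe_t <= anoxic_threshold:
        # No reactive-iron enrichment: consistent with an oxic water column,
        # although burial alteration can lower the ratio in old rocks.
        return "oxic or equivocal"
    if fe_py / fe_hr > euxinic_threshold:
        return "euxinic"
    return "ferruginous"


# Hypothetical samples (wt%), chosen only to exercise each branch.
print(classify_redox(fe_hr=2.0, fe_t=4.0, fe_py=0.5))
print(classify_redox(fe_hr=2.0, fe_t=4.0, fe_py=1.8))
print(classify_redox(fe_hr=1.0, fe_t=4.0, fe_py=0.2))
```

The three hypothetical samples fall into the ferruginous, euxinic and non-enriched categories, respectively. Lowering `anoxic_threshold` shows how samples near the limit would be reclassified under a more permissive criterion for altered rocks.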

Because the McArthur basin has had a defining role in previous 
arguments for mid-Proterozoic euxinia’®’””’', we began our search 
for ferruginous conditions with an additional analysis of fresh drill 
cores of shale from deep-water settings in this region. We specifically 
investigated the iron chemistry of the ~1.64-Gyr-old Barney Creek 
and Lady Loretta formations in the McArthur and Mount Isa basins, 
respectively. Our samples are from geographically widespread marine 
sequences that extend over more than 2,000km across northern 
Australia. We included locations with palaeogeographic positions 
closer to the open ocean compared with past studies in the region that 
also focused on marine palaeoredox. In addition, we targeted the 
deepest-water facies as delineated in previous detailed basin analysis 
(see, for example, ref. 22).
Figure 1 | Iron speciation and sulphur isotope data for mid-Proterozoic
shales. Data from the 1.64-Gyr-old Mt Isa Superbasin (black diamonds and
bars), the 1.7-Gyr-old Chuanlinggou Formation (white squares and bars), the
1.45-Gyr-old Newland Formation (grey triangles and bars) and the 1.2-Gyr-old
Borden basin (grey circles). a, The vast majority of our samples have ratios of
highly reactive to total iron (Fe_HR/Fe_T) and of pyrite to highly reactive iron
(Fe_Py/Fe_HR) falling above 0.15-0.38 and below 0.7-0.8, respectively, which is
diagnostic of sediment accumulation beneath an anoxic and iron-rich (non-
sulphidic) water column. b, Pyrite δ³⁴S isotope values (δ³⁴S_Py) relative to
Vienna Canyon Diablo Troilite (VCDT). Estimates for sulphate δ³⁴S values are
from refs 32, 33.


FeHR/FeT values in both the Barney Creek and Lady Loretta formations 
are generally above 0.38, conservatively indicating deposition 
under anoxic conditions (Fig. 1a). The vast majority of these samples 
have FePY/FeHR ratios well below 0.8, which is consistent with a 
persistently sulphide-free water column. Together, these ratios point to 
widespread ferruginous conditions over thick (hundreds of metres) 
stratigraphic intervals, indicating prolonged periods of ferruginous 
deep waters, with the likelihood of laterally contemporaneous 
occurrences of euxinia in certain small or isolated sub-basins and/or on 
the shallower margins. Previous regional studies have argued for a 
relatively strong marine connection during deposition at our specific 
sample locations (see Supplementary Information), suggesting that 
deep ocean waters enriched in dissolved Fe2+ may have exchanged 
with the McArthur and Mt Isa basins. 

Given these exciting results, we were obliged to look beyond this 
region for records of mid-Proterozoic ferruginous waters. With this goal, 
we analysed additional suites of carbonaceous shales from other, widely 
distributed mid-Proterozoic basins, emphasizing well-preserved (sub- 
greenschist) shales from diverse palaeogeographic settings spanning the 
mid-Proterozoic. Each of these additional units yielded abundant 
samples with FeHR/FeT > 0.38 and FePY/FeHR < 0.8 (Fig. 1a), signifying 
widespread ferruginous depositional conditions. Our data include 
samples from the 1.7-Gyr-old Chuanlinggou Formation in northern 
China, the 1.45-Gyr-old Belt Supergroup in the north-central USA and 
the 1.2-Gyr-old Borden basin in Arctic Canada. The Chuanlinggou 
Formation is interpreted as being a passive-margin sequence, suggest- 
ing a strong connection to the open ocean, and the Borden basin was a 
passive margin that evolved into a foredeep setting. In contrast, the Belt 
basin probably represents an extensional marine setting with transi- 
ently more restricted depositional conditions (see Supplementary 
Information). 

A small subset of samples from the Mt Isa superbasin, the Belt 
Supergroup and the Borden basin have significant iron enrichments 
and FePY/FeHR near 0.8 (Fig. 1a), suggesting that sulphidic conditions 
may have developed episodically in the water column. The lack of 
persistently euxinic conditions in the Belt Supergroup is surprising. 
As a semi-isolated, probably marginal marine system with evidence for 
high rates of primary productivity, the Belt basin would seem ideally 
suited to developing euxinia—as we see in the modern, restricted Black 
Sea. Clear fingerprints of ferruginous conditions in the Chuanlinggou 
Formation are also revealing: as a passive-margin sequence lacking 
indications of appreciable basin restriction, this setting provides one 
of our best windows on conditions in the open Proterozoic ocean. 

Our finding of iron-rich conditions in several mid-Proterozoic 
marine settings contrasts with the widely accepted view of globally per- 
sistent and pervasive deep euxinia. However, this discovery is entirely 
consistent with an emerging view of Precambrian ocean chemistry 
brought to light by the most recent trace-metal and iron speciation studies 
from younger and older portions of the Precambrian ocean. Specifically, 
there is evidence for coexisting iron-rich and H2S-rich conditions in 
several Neoarchaean, early and middle Palaeoproterozoic and 
early and late Neoproterozoic settings (Fig. 2). Ferruginous conditions 
were apparently widespread in the deeper portions of the ocean, whereas 
sulphide was probably limited to highly productive regions along the 
continental margins, which is analogous to the more reducing 
conditions in modern oxygen minimum zones. Our data fill a billion- 
year gap in the evidence for this marine redox state, indicating a hitherto 
undocumented continuity of iron-rich conditions throughout the 
Precambrian. 

Our finding of extensive ferruginous conditions is also consistent 
with a recent study of marine molybdenum inventories, which 
argued that the extent of euxinic depositional environments during 
the mid-Proterozoic could have been severalfold that of the modern 
ocean (<1%) but far from whole-ocean euxinia. Similarly, 
mid-Proterozoic Mo isotope data are easily explained through greatly 
expanded (relative to today) but still largely local euxinia, with deep 
waters that were dominantly ferruginous and thus less efficient at 
burying Mo (see, for example, ref. 20). 

Figure 2 | Summary of marine chemical conditions in the Precambrian. 
a, Estimates of atmospheric oxygen compared with present atmospheric level 
(PAL). b, Classical models of the chemical composition of the deep ocean. 
c, Distribution of Precambrian euxinic and ferruginous deep waters, based on 
the shale record. Our study provides evidence for extensively developed and 
likely persistent ferruginous conditions in the deep ocean during the 
mid-Proterozoic, which was previously thought to have been characterized by either 
oxygenated or sulphide-rich conditions. The emerging view based on redox 
studies of marine shales is that during the mid-Proterozoic, when there were 
relatively low levels of atmospheric oxygen, both euxinic and ferruginous 
waters were common, and often stratified, below the oxygenated 
surface-mixing zone. In the Phanerozoic, with higher levels of atmospheric oxygen, the 
deep oceans were anoxic for only short periods (see the text for details). 

We argue that the flux of organic matter was central to controlling 
the redox landscape of the mid-Proterozoic ocean, as has been sug- 
gested for other—both older and younger—instances of Precambrian 
euxinia. Estimates for dissolved sulphate levels in the 
mid-Proterozoic ocean range from 500 to 3,000 µM (see Supplementary 
Information). Even the lower estimate for sulphate is well above the 
upper limit for dissolved iron, which is fixed at roughly 100 µM by the 
solubilities of iron carbonates and silicates. Therefore, ferruginous 
marine conditions must instead mirror limited sulphide production. 
Sulphide is produced anaerobically by bacteria at the expense of 
organic matter. It follows that spatial gradients in the organic flux 
and, ultimately, organic productivity may have limited the extent of 
euxinia. 
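
The concentration argument in this paragraph can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text (sulphate of 500 to 3,000 µM, a dissolved iron ceiling of roughly 100 µM); the variable and function names are illustrative assumptions.

```python
SULPHATE_RANGE_UM = (500.0, 3000.0)  # estimated mid-Proterozoic sulphate, µM (from the text)
IRON_CEILING_UM = 100.0              # approximate upper limit for dissolved iron, µM (from the text)


def sulphate_to_iron_ratio(sulphate_um, iron_um=IRON_CEILING_UM):
    """Return how many times sulphate exceeds the dissolved iron ceiling."""
    return sulphate_um / iron_um


low_ratio, high_ratio = (sulphate_to_iron_ratio(s) for s in SULPHATE_RANGE_UM)
print(low_ratio, high_ratio)
```

Because sulphate exceeds the iron ceiling severalfold even at the low end of the range, sulphate supply alone cannot explain ferruginous water; the limit has to lie in sulphide production.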

Consistent with an organic matter delivery control on the distri- 
bution of sulphidic marine conditions, the analysed samples contain 
substantially less organic carbon than do typical euxinic Precambrian 
and Phanerozoic shales. Samples in our study contain on average less 
than 1% organic carbon, which is severalfold lower than concentra- 
tions common in euxinic shales (see, for example, ref. 27). Low levels of 
organic matter in ferruginous shales suggest relatively low productivity 
in the overlying water column. In addition, there is a sulphur isotope 
signal consistent with bacterial sulphate reduction occurring predominately 
in the porewaters. Pyrite in our samples has δ34S values that are 
slightly lower than or equivalent to coeval sulphate (Fig. 1b). A simple 
explanation for these results is that bacterial sulphate reduction is 
occurring largely in sediments where potentially high isotopic fractio- 
nations are muted by limited sulphate availability. Sulphate supplies in 
the sediments would be controlled by rates of diffusional replenish- 
ment, and associated deficiencies would be exacerbated by the com- 
paratively small amount of sulphate in mid-Proterozoic seawater. In 
other words, limited availability of organic matter probably caused the 
onset of appreciable bacterial sulphate reduction to be restricted to the 
sediments. However, these sulphur isotope results do not completely 
exclude water column sulphur cycling. 

For ferruginous conditions to have been extensive in the mid- 
Proterozoic ocean, dissolved oxygen acquired in surface waters 
through photosynthesis and gas exchange with the overlying atmo- 
sphere must have been consumed as deep water masses aged. Oxygen 
will be consumed through the degradation of sinking organic matter 
and, if oxygen remains available at depth, by hydrothermally sourced 
reductants (see, for example, ref. 29). Our results indicate that the flux 
of Fe2+ into deep waters typically exceeded rates of sulphide generation 
in all but nearshore or restricted regions with relatively high rates 
of primary productivity that fuelled localized sulphate reduction in the 
water column. 

Our results also call for a reconsideration of the factors controlling 
the temporal distribution of large iron formations. In contrast with the 
canonical view, in which iron formations disappeared as the deep 
ocean evolved from iron to oxygen or sulphide domination, the long 
persistence of ferruginous conditions in our model argues that iron 
formations are anomalous sedimentary deposits linked in most cases 
to an enhanced iron supply by means of strong hydrothermal inputs. 
Consistent with our ocean model, the amount of hydrothermal iron 
released to the oceans has varied greatly with marine sulphate 
concentrations and mantle plume activity as reflected by dyke swarms 
and large igneous provinces. 

Our findings cast a new perspective on mid-Proterozoic environ- 
mental conditions, ecology and evolution. For example, evidence for 
extensive ferruginous conditions throughout the Proterozoic ocean 
provides a simple answer to the apparent conundrum of increasing 
enzymatic use of iron, molybdenum and cobalt during the mid- 
Proterozoic as inferred by a recent study of the evolution of almost 
4,000 gene families. It is possible for these bioessential metals to have 
been readily available in an ocean with pervasively ferruginous deep 
waters, in contrast with the certainty of biolimitation if deep waters 
were globally sulphidic. Free sulphide in the water column greatly 
decreases the solubility of these elements. It remains to be tested, 
however, whether broad, but far from global, extents of euxinia in a 
stratified ocean were still able to pull down trace metal inventories at 
least locally to biologically critical levels, as suggested in previous work. 
More generally, our data now provide the foundation for a unified model 
for the chemical evolution of the Precambrian ocean consistent with 
diverse redox tracers and bridging past work bracketing the mid- 
Proterozoic. Recognizing the spatial and temporal heterogeneity 
expected in a dynamic early ocean, we propose the almost continuous 
coexistence of sulphide-rich and iron-rich conditions for billions of years 
beneath oxic surface waters as the backdrop for Precambrian biological 
evolution, and specifically the protracted radiation of eukaryotes and the 
ultimate rise of animals. 


METHODS SUMMARY 


Iron speciations were performed at the University of California, Riverside (UCR), 
using a well-calibrated sequential extraction protocol designed to quantify the 
different pools of FeHR (ref. 18). A small portion of sample powder (~100 mg) 
was used for the extractions, and iron concentrations were determined with an 
Agilent 7500ce inductively coupled plasma mass spectrometer (ICP-MS) at UCR. 
FePY was calculated on the basis of the weight percentage of sulphur extracted 
during a 2-h hot chromous chloride distillation followed by iodometric titration. 
Total iron concentrations were determined by one of two methods: X-ray 
fluorescence at the CODES Research Centre at the University of Tasmania, or a 
three-acid digest and ICP-MS analysis at UCR. Sulphur isotope measurements were 
made at UCR with a ThermoFinnigan Delta V continuous-flow stable-isotope-ratio 
mass spectrometer after a chromous chloride distillation, where the pyrite-S 
was reprecipitated as Ag2S. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


The extraction method used in this study to speciate between the reactive iron 
pools in fine-grained siliciclastic rocks and sediments has been described in detail 
elsewhere, and we therefore provide only an overview here. In short, our iron 
speciations were performed at UCR, using a well-calibrated sequential extraction 
protocol designed to quantify the different pools of FeHR (refs 6, 18). FeHR is 
subdivided into three subpools, each with the potential to react with hydrogen 
sulphide on diagenetic timescales: carbonate-associated iron extracted with a 
sodium acetate solution (Fecarb), ferric oxides extracted with a dithionite solution 
(Feox), and mixed-valence iron oxides, principally magnetite, extracted with 
ammonium oxalate (Femag). We used ~100 mg of sample powder, and the 
sequential extracts were analysed with an Agilent 7500ce ICP-MS. Pyrite (FePY) 
is also included in the FeHR pool. FePY was calculated (assuming a stoichiometry of 
FeS2) on the basis of the weight percentage of sulphur extracted during a 2-h hot 
chromous chloride distillation followed by iodometric titration. The assumption 
of an FeS2 stoichiometry in the sulphide pool was tested through extensive 
extractions for acid-volatile sulphide with hot SnCl2-HCl (15% SnCl2, 6 M HCl) for 1 h. 


The samples included here all contain less than 0.1% sulphur extractable by HCl. 
Total iron concentrations were determined by one of two methods: X-ray 
fluorescence at the CODES Research Centre at the University of Tasmania, or a 
three-acid digestion followed by ICP-MS analysis at UCR. On the basis of duplicate 
analyses and Geostandard monitoring, reproducibility of iron measurements was 
better than 5%. Samples with less than 0.1% iron were reproducible to two 
decimal places, although the relative error can exceed 5%. At such low 
levels of iron, these errors have no impact on our conclusions. 
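
The bookkeeping behind the two speciation ratios can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing code: the function names and the sample values are hypothetical, the atomic masses are standard values, and FePY is obtained from extracted sulphur by assuming FeS2 stoichiometry as described above.

```python
FE_ATOMIC_MASS = 55.845  # g/mol
S_ATOMIC_MASS = 32.06    # g/mol


def pyrite_iron_wt_pct(pyrite_sulphur_wt_pct):
    """Convert pyrite sulphur (wt%) to pyrite iron (wt%), assuming FeS2."""
    # FeS2 has one Fe atom for every two S atoms.
    return pyrite_sulphur_wt_pct * FE_ATOMIC_MASS / (2 * S_ATOMIC_MASS)


def speciation_ratios(fe_carb, fe_ox, fe_mag, pyrite_sulphur, fe_total):
    """Return (FeHR/FeT, FePY/FeHR) from pool concentrations in wt%."""
    if fe_total <= 0:
        raise ValueError("total iron must be positive")
    fe_py = pyrite_iron_wt_pct(pyrite_sulphur)
    # The highly reactive pool is the three extracted subpools plus pyrite iron.
    fe_hr = fe_carb + fe_ox + fe_mag + fe_py
    if fe_hr <= 0:
        raise ValueError("highly reactive iron must be positive")
    return fe_hr / fe_total, fe_py / fe_hr


# Hypothetical sample: all values in wt% and invented for illustration.
hr_over_t, py_over_hr = speciation_ratios(
    fe_carb=0.8, fe_ox=0.3, fe_mag=0.2, pyrite_sulphur=0.5, fe_total=3.5
)
print(round(hr_over_t, 2), round(py_over_hr, 2))
```

The mass conversion factor is about 0.87 g of iron per gram of pyrite sulphur, so a sample with little extractable sulphur necessarily has a low FePY/FeHR ratio unless the other reactive pools are also small.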

We determined concentrations of total organic carbon by taking the difference 
between carbonate carbon liberated by 4 M HCl and total carbon released by 
combustion at 1,450 °C, both of which were measured with an ELTRA C/S 
determinator at UCR. Last, also at UCR, pyrite-S was extracted for isotope 
measurements by using the same chromous chloride distillation but, in this case, 
reprecipitating the pyrite-S as Ag2S. Sulphur isotope measurements were made 
with a ThermoFinnigan Delta V continuous-flow stable-isotope-ratio mass 
spectrometer. Reproducibility was better than 0.2‰ on the basis of single-run 
and long-term standard monitoring. 
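
The organic carbon determination above is a calculation by difference, which can be sketched in a few lines. The function name and the example values are assumptions made for illustration.

```python
def total_organic_carbon(total_carbon_wt_pct, carbonate_carbon_wt_pct):
    """Return organic carbon (wt%) as total carbon minus carbonate carbon."""
    if carbonate_carbon_wt_pct > total_carbon_wt_pct:
        raise ValueError("carbonate carbon cannot exceed total carbon")
    return total_carbon_wt_pct - carbonate_carbon_wt_pct


# Hypothetical sample: 2.4 wt% total carbon, 1.7 wt% carbonate carbon.
print(round(total_organic_carbon(2.4, 1.7), 2))
```

Because the result is a difference of two measurements, its uncertainty grows when both carbon pools are large relative to the organic fraction.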

Phylogenomics reveals deep molluscan relationships 

Evolutionary relationships among the eight major lineages of 
Mollusca have remained unresolved despite their diversity and 
importance. Previous investigations of molluscan phylogeny, based 
primarily on nuclear ribosomal gene sequences or morphological 
data, have been unsuccessful at elucidating these relationships. 
Recently, phylogenomic studies using dozens to hundreds of genes 
have greatly improved our understanding of deep animal 
relationships. However, limited genomic resources spanning molluscan 
diversity have prevented use of a phylogenomic approach. Here we 
use transcriptome and genome data from all major lineages (except 
Monoplacophora) and recover a well-supported topology for 
Mollusca. Our results strongly support the Aculifera hypothesis 
placing Polyplacophora (chitons) in a clade with a monophyletic 
Aplacophora (worm-like molluscs). Additionally, within Conchifera, 
a sister-taxon relationship between Gastropoda and Bivalvia is sup- 
ported. This grouping has received little consideration and contains 
most (>95%) molluscan species. Thus we propose the node-based 
name Pleistomollusca. In light of these results, we examined the evolu- 
tion of morphological characters and found support for advanced 
cephalization and shells as possibly having multiple origins within 
Mollusca. 

With over 100,000 described extant species in eight major lineages, 
Mollusca is the second most speciose animal phylum. Many molluscs 
are economically important as food and producers of pearls and shells 
whereas others cause economic damage as pests, biofoulers and invasive 
species. Molluscs are also biomedically important as models for the 
study of brain organization, learning and memory as well as vectors 
of parasites. Although shelled molluscs have one of the best fossil 
records of any animal group, evolutionary relationships among major 
molluscan lineages have been elusive. 

Morphological disparity among the major lineages of Mollusca has 
prompted numerous conflicting phylogenetic hypotheses (Fig. 1). The 
vermiform Chaetodermomorpha (also known as Caudofoveata) and 
Neomeniomorpha (also known as Solenogastres) traditionally have been 
considered to represent the plesiomorphic state of Mollusca because 
of their ‘simple’ internal morphology and lack of shells. Whether these 
two lineages constitute a monophyletic group, Aplacophora, or a 
paraphyletic grade has been widely debated. Some workers have considered 
the presence of sclerites a synapomorphy for a clade Aculifera, 
uniting Polyplacophora (chitons; which have both sclerites and 
shells) and Aplacophora. In contrast, Polyplacophora has alternatively 
been placed with Conchifera (Bivalvia, Cephalopoda, Gastropoda, 
Monoplacophora and Scaphopoda) in a clade called Testaria uniting 
the shelled molluscs. Morphology has been interpreted to divide 
Conchifera into a gastropod/cephalopod clade (Cyrtosoma) and a 
bivalve/scaphopod clade (Diasoma). Unfortunately, because of varying 
interpretations of features as derived or plesiomorphic, a lack of clear 
synapomorphies, and often unclear character homology, the ability of 
morphology to resolve such deep phylogenetic events is limited. 

Molecular investigations of molluscan phylogeny have relied 
primarily on nuclear ribosomal gene sequences (18S and 28S), 
and have also offered little resolution. Maximum likelihood (ML) 
analyses of 18S, 28S or both recovered most major lineages 
monophyletic, but support at deeper nodes was generally weak. Subsequent 
analyses of a combined data set (18S, 28S, 16S, cytochrome c oxidase I 
and histone H3) yielded similar results, namely that bivalves were not 
monophyletic and support values at most deep nodes were low. 
Expanding on this study, further work supported a sister-taxon 
relationship between chitons and monoplacophorans (Serialia) but 
support at other deep nodes was generally low. Moreover, Mollusca was 
not recovered monophyletic (a result significantly supported by 
Approximately Unbiased, AU, tests; Supplementary Table 1) possibly 
due to contaminated neomenioid sequences. 

Morphological and traditional molecular phylogenetic approaches 
have failed to robustly reconstruct mollusc phylogeny. Notably, several 
recent phylogenomic studies (for example, refs 5 and 11) have signifi- 
cantly advanced our understanding of metazoan evolution by using 
sequences derived from genome and transcriptome data. With this 
approach, numerous orthologous protein-coding genes can be iden- 
tified and employed in phylogeny reconstruction. Many of these genes 
are constitutively expressed and can be easily recovered from even 
limited expressed sequence tag (EST) surveys. Additionally, these 
genes are usually informative for inferring higher-level phylogeny 
because of their conserved nature due to their functional importance. 


Figure 1 | Leading hypotheses of molluscan phylogeny. a, Adenopoda 
hypothesis placing Chaetodermomorpha basal. b, Hepagastralia hypothesis 
placing Neomeniomorpha basal. c, Aculifera hypothesis placing Aplacophora 
sister to Polyplacophora. d, Serialia hypothesis allying Polyplacophora and 
Monoplacophora. e, Diasoma and Cyrtosoma hypotheses allying bivalves to 
scaphopods and gastropods to cephalopods, respectively. f, Unnamed 
hypothesis, allying scaphopods and cephalopods. 

Figure 2 | Relationships among major lineages of Mollusca based on 308 
genes. Bayesian inference topology is shown; ML bootstrap support values 
(bs) >50 and posterior probabilities (pp) >0.50 are listed at each node. Filled 
circles represent nodes with bs = 100 and pp = 1.00. Taxa from which new data 
were collected are shown in bold. 

Here, we used such a phylogenomic approach to investigate evolutionary 
relationships among the major lineages of Mollusca. High-throughput 
transcriptome data were collected from 18 operational 
taxonomic units (OTUs; Supplementary Table 2), and augmented 
with publicly available ESTs and genomes (Supplementary Table 3). 
To increase data set completeness, data from closely related species 
were combined in eleven cases, resulting in a total of 42 mollusc OTUs. 
Every major lineage of Mollusca was represented in the data set by at 
least two distantly related species, except for monoplacophorans, which 
live in deep marine habitats and could not be procured in adequate 
condition for transcriptome analyses. For sequence processing and 
orthology determination, a bioinformatic pipeline was developed that 
builds upon previous studies (see Methods and Supplementary Fig. 2). 
This pipeline identified 308 orthologous genes suitable for concatenation 
and phylogenetic analyses (Supplementary Table 4), totalling 
84,614 amino acid positions. 

To determine the appropriate outgroup to Mollusca, preliminary 
analyses including a broad range of lophotrochozoans and the cnidarian 
Nematostella were conducted. Nematostella was included to verify that 
neomenioid data did not contain cnidarian contamination (see 
Methods). Maximum likelihood (ML) analyses using the best-fitting 
model for each gene strongly supported Annelida as the sister taxon of 
Mollusca (bootstrap support, bs = 100, Supplementary Fig. 3), whereas 
Bayesian inference (BI) placed Entoprocta + Cycliophora sister to 
Mollusca with poor support (posterior probability, pp = 0.62, 
Supplementary Fig. 4). Relationships among major lineages of Mollusca 
were consistent between analyses with multiple outgroups (Sup- 
plementary Figs 3-4) or with only Annelida as outgroup (Fig. 2 and 
Supplementary Fig. 5; additional information on outgroup selection in 
Supplementary Results). On the basis of these results, Annelida was 
selected as outgroup for all other analyses to reduce computational 
complexity and potential homoplasy from distant or fast-evolving out- 
groups. This final data matrix including all 308 genes (Fig. 3) had an 
average percentage of genes sampled per taxon of 41% and an overall 
matrix completeness of 25.6%, comparable to other major phylogenomic 
data sets (for example, ref. 11). ML and BI analyses of this matrix 


Figure 3 | Data matrix coverage. Genes are ordered along the x-axis from left 
to right from best sampled to worst sampled. Taxa are ordered along the y-axis 
from top to bottom from most genes sampled to fewest genes sampled. Black 
squares represent a sampled gene fragment and white squares represent a 
missing gene fragment. 
yielded nearly identical topologies within Mollusca, except for relationships 
among basal gastropods and placements of the sea slug 
Pleurobranchaea and the bivalve Mytilus (Fig. 2 and Supplementary 
Fig. 5). High leaf stability scores for all OTUs (Supplementary Table 3) 
and strong support for most nodes suggest all OTUs were represented 
by sufficient data to be reliably placed. Remarkably, branch lengths 
were relatively uniform; cephalopods did not show long branches as 
previously reported in analyses of 18S and 28S. 
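
The two coverage statistics quoted for the data matrix (genes sampled per taxon, and overall completeness) can be illustrated on a toy example. The sketch below is an assumption about how such figures are typically computed, not a description of the authors' pipeline: gene sampling is taken from a taxon-by-gene presence table, and completeness is taken as the fraction of non-missing characters in a concatenated alignment. All names and data are hypothetical.

```python
def mean_gene_sampling(presence):
    """Average, over taxa, of the fraction of genes sampled for each taxon."""
    fractions = [sum(row) / len(row) for row in presence.values()]
    return sum(fractions) / len(fractions)


def matrix_completeness(alignment, missing="-?"):
    """Fraction of alignment cells that hold a residue rather than a gap or unknown."""
    total = sum(len(seq) for seq in alignment.values())
    filled = sum(1 for seq in alignment.values() for ch in seq if ch not in missing)
    return filled / total


# Toy presence table: three taxa, four genes (True means a gene fragment was sampled).
presence = {
    "taxon_a": [True, True, True, False],
    "taxon_b": [True, False, False, False],
    "taxon_c": [True, True, False, False],
}

# Toy concatenated alignment with gaps and unknown positions.
alignment = {
    "taxon_a": "MKV-LT",
    "taxon_b": "M-----",
    "taxon_c": "MK??LT",
}

print(mean_gene_sampling(presence), matrix_completeness(alignment))
```

The two numbers differ whenever sampled genes are themselves fragmentary, which is one plausible reason a matrix can have 41% of genes sampled per taxon but lower overall completeness.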

All major lineages of Mollusca were monophyletic with strong 
support (bs = 100%, pp = 1.00). Importantly, there was strong 
support at all deep nodes, although the node placing Scaphopoda 
received moderate support in ML (bs = 72%) but strong support in 
BI (pp = 0.98). A clade including Aplacophora and Polyplacophora 
was unequivocally supported (bs = 100%, pp = 1.00) and placed sister 
to Conchifera, consistent with the Aculifera hypothesis. Moreover, we 
found strong support (bs = 100%, pp = 0.99) for a sister relationship 
between Neomeniomorpha and Chaetodermomorpha, supporting the 
Aplacophora hypothesis but contrary to previous molecular and 
morphological studies. To evaluate alternatives to the Aculifera and 
Aplacophora hypotheses, we used AU tests (Supplementary Table 5). 
These tests rejected the Testaria hypothesis, which allies chitons 
with the other shelled molluscs (P < 0.02), and placement of either 
aplacophoran taxon as sister to all other molluscs (both P < 0.01). 
Aculiferan monophyly supports interpretation of the Palaeozoic taxon 
‘Helminthochiton’ thraivensis as possessing features intermediate 
between chitons and aplacophorans, and interpretation of dorsal, 
serially arranged calcareous structures as a possible aculiferan 
synapomorphy. Specifically, the chaetoderm Chaetoderma and some, 
but not all, neomenioids possess dorsal, serially repeated 
sclerite-secreting regions during development. Notably, chiton valves are not 
thought to be homologous to aculiferan sclerites, although certain 
genes involved in patterning these structures may be. Our results high- 
light a need for developmental gene expression studies of aculiferans to 
address this issue. 

Within a monophyletic Conchifera (bs = 100%, pp = 0.98), Gastro- 
poda and Bivalvia were supported as derived sister taxa (bs = 100%, 
pp = 1.0). Traditionally, a sister relationship between gastropods and 
bivalves, which relates the two most speciose lineages of molluscs, has 
received little consideration. However, this relationship has been 
recovered in molecular studies with relatively limited taxon sampling 
across Mollusca. Similarities between the veliger larvae of gastropods 
and lamellibranch bivalves have been long recognized. Most 
notably, both possess larval retractor muscles and a velum muscle 
ring. Another potential synapomorphy is loss of the anterior ciliary 
rootlet in locomotory cilia of gastropods and bivalves. Because of 
strong support for a gastropod/bivalve clade in most analyses and 
the implications of this hypothesis for understanding molluscan evolution, 
we propose the node-based name Pleistomollusca, which 
includes the last common ancestor of Gastropoda and Bivalvia and 
all descendants (Fig. 4). Etymology of this name (pleistos from Greek 
for ‘most’) recognizes the incredible species diversity of this clade of 
molluscs, which we conservatively estimate to contain >95% of 
described mollusc species. 

Sister to Pleistomollusca is Scaphopoda (albeit with moderate support 
in ML; bs = 72%, pp = 0.98) and Cephalopoda represents the sister 
taxon of all other conchiferan lineages sampled. Despite strong support 
values for a gastropod/bivalve clade, AU tests failed to reject Scaphopoda 
as sister to any other conchiferan lineage (P > 0.5). Given the limited 
sampling for Scaphopoda, additional data may help solidify its position. 
Nonetheless, all results presented here clearly refute the traditional view 
of a sister relationship between gastropods and cephalopods (Cyrtosoma; 
P < 0.01). Features thought to be diagnostic of this clade include a 
well-developed, free head with cerebrally innervated eyes and a nervous 
system with visceral loop inwards of the dorsoventral musculature. 
However, these characters must be reinterpreted as either symplesiomorphies 
lost in scaphopods and bivalves, or convergences. Notably, 
the high degree of cephalization in gastropods and cephalopods has 
recently been suggested to have evolved independently. 

Figure 4 | Deep molluscan phylogeny as inferred in the present study. Black 
circles represent nodes with bs = 100 and pp = 1.00. Gray circles represent 
nodes with bs = 100 and pp = 0.98. The actual specimens of Polyschides and 
Hanleya used in this study are shown. Photos are not to scale. A full-page 
version of this figure is presented in Supplementary Fig. 1. 
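
The branching order reported here can be encoded as a nested structure and queried for monophyly, which makes the competing hypotheses easy to compare. The sketch below is illustrative: the nested-tuple encoding and the function names are assumptions, the outgroup is omitted, and Monoplacophora is absent because it was not sampled.

```python
# Rooted topology as described in the text, written as nested tuples.
TREE = (
    (("Neomeniomorpha", "Chaetodermomorpha"), "Polyplacophora"),   # Aculifera
    ("Cephalopoda", ("Scaphopoda", ("Gastropoda", "Bivalvia"))),   # Conchifera
)


def leaves(node):
    """Return the set of terminal taxa below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the leaf set of every subtree, including the node itself."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, taxa):
    """True if some subtree contains exactly the given taxa."""
    return frozenset(taxa) in set(clades(tree))


print(is_monophyletic(TREE, {"Gastropoda", "Bivalvia"}))      # Pleistomollusca
print(is_monophyletic(TREE, {"Gastropoda", "Cephalopoda"}))   # Cyrtosoma
```

On this topology the gastropod/bivalve and Aculifera groupings are clades, whereas Cyrtosoma and Testaria are not, matching the hypotheses the text supports and rejects.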

The phylogenomic approach used here also holds promise for resolv- 
ing relationships within major lineages. For example, although their 
phylogeny has been widely debated, our broadly sampled caenogastropod 
subtree was strongly supported throughout (bs = 100, pp = 1.0) and 
consistent with previous morphological analysis. We also recovered 
opisthobranchs paraphyletic with respect to Pulmonata, agreeing with 
recent morphological and molecular studies. Additionally, our analyses 
confirm bivalve monophyly with deposit-feeding protobranchs sister to 
filter-feeding lamellibranchs. 

To assess robustness of the reconstructed topology further, we 
examined the influences of matrix completeness, gene inclusion and 
substitution models on phylogenetic reconstruction (Supplementary 
Table 6). Analyses of the 200 and 100 best-sampled genes (Supplemen- 
tary Figs 6 and 7) recovered the same branching order and relative level 
of support among major lineages as the full data set. For gene inclusion, 
matrices of only non-ribosomal (Supplementary Fig. 8) and only ribo- 
somal protein genes (Supplementary Fig. 9) were analysed to address 
issues of different gene classes (for example, ribosomal proteins) biasing 
phylogenetic signal. Support values for deep nodes inferred from 
non-ribosomal protein genes were generally weak and Aplacophora, 
Polyplacophora and Bivalvia were not recovered monophyletic. In 
contrast, analysis of only ribosomal protein genes recovered all major 
lineages monophyletic with strong support in BI but moderate support 
for most deep nodes in ML (see also ref. 17). Although ribosomal 
protein and non-ribosomal protein genes seem to be contributing 
different amounts of phylogenetic signal, support for most nodes 
was greater when all gene classes were included, in accordance with 
previous phylogenomic studies. We also performed an analysis 
based on very conservative orthology determination using only the 
243 genes for which our method and InParanoid identified the same 
Lottia sequence as orthologous to the primer taxon (Drosophila) 
sequence (see Methods). Branching order (Supplementary Fig. 10) 
was identical to the tree based on all 308 genes (Fig. 2). Our ML 
analyses differ from other phylogenomic studies by using gene-specific 
amino acid substitution models rather than a single model across the 
entire matrix. Thus, for comparative reasons, we also ran single-model 

Inferred plesiomorphic state of Mollusca 

Character                   Monoplacophora not considered  Monoplacophora basal in Conchifera  Monoplacophora sister to Polyplacophora 
Shell by shell gland        Absent                         Absent                              Equivocal 
Periostracum                Absent                         Absent                              Equivocal 
Position of mantle cavity   Equivocal                      Circumpedal                         Equivocal 
Number of D-V muscles       Equivocal                      Eight or more                       Equivocal 
Pedal ganglia               Equivocal                      Absent                              Equivocal 
Cerebral (pretrochal) eyes  Equivocal                      Absent                              Equivocal 

Only six of 60 characters were affected by the placement of Monoplacophora. See Supplementary Table 7 for additional characters and coding for all characters. 


ML analyses using the WAG + CAT + F model (Supplementary Fig. 11) 
and the LG + CAT + F model (Supplementary Fig. 12). These analyses 
yielded the same relationships as the ML analysis using the best-fitting 
model for each gene (Supplementary Fig. 5) with similar overall support 
in all three analyses. We also assessed the effect of model selection by 
performing a BI analysis using the CAT-GTR model on the data set of 
the 100 best-sampled genes (Supplementary Fig. 7); this model is too 
computationally intensive for the full 308 gene data set. Except for the 
placement of Pleurobranchaea, this analysis yielded the same branching 
order as the analysis using the CAT model (Fig. 2) with similar support 
values. Finally, even an approximately ML analysis (Supplementary 
Fig. 13), which is less computationally intensive, yielded the same rela- 
tionships among major lineages as the fully parameterized ML analysis. 

A primary goal of resolving molluscan phylogeny is to improve our 
understanding of their early evolutionary history. Perhaps more than any 
other animal group, understanding of molluscan early evolution has been 
constrained by the notion of a generalized bauplan or ‘archetype’, which is 
still propagated by some invertebrate zoology textbooks. Arguably, such a 
viewpoint has hindered our ability to consider how individual characters 
have evolved within Mollusca. Using a modified version of a morpho- 
logical character matrix’, we performed ancestral state reconstruction 
using maximum parsimony and a simplified topology based on our 
results (Fig. 4) to infer ancestral states for 60 characters across 
Mollusca (Supplementary Table 7). Even though monoplacophoran 
transcriptome data were unavailable herein, we were able to evaluate 
how placement of Monoplacophora influences our understanding of 
early molluscan evolution. Ancestral state reconstruction of most char- 
acters for the last common ancestor of Mollusca was unaffected by the 
placement of monoplacophorans. We considered three possibilities: (1) 
Monoplacophora basal within Conchifera, (2) sister to Polyplacophora, 
and (3) absent from the analysis. In all three cases, only 6 out of 60 
characters were influenced (Table 1). For example, ancestral state recon- 
struction for shell(s) secreted by a shell gland and periostracum changed 
between absent (Monoplacophora basal conchiferan) and equivocal 
(Monoplacophora sister to Polyplacophora, or not considered). 

Results of these ancestral state reconstructions shed light on the early 
evolution of Mollusca. Odontogriphus, a Middle Cambrian form proposed 
to be a stem-group mollusc, showed character states consistent with our 
reconstructions (ventral muscular foot, dorsal cuticular mantle, mantle 
cavity containing ctenidia or gills, and regionalized gut)”. However, 
whereas Odontogriphus and Wiwaxia (another Middle Cambrian 
putative stem-group mollusc) apparently had a narrow, distichous 
(bipartite, aplacophoran-like) radula”*”, ancestral state reconstruction 
indicates that the plesiomorphic state of the radula was broad and 
rasping with multiple teeth per row attached to a flexible radular mem- 
brane supported by muscular and cartilage-like bolsters as in chitons 
and most conchiferans. 

The origin and evolution of molluscan epidermal hardparts (shells 
and sclerites) is another contentious issue. Although aculiferan sclerites, 
chiton valves and conchiferan shells are all calcareous secretions of the 
mantle, developmental and structural differences indicate that these 
structures are not homologous’®. Sclerites are only present in aculiferans, 
and shells secreted by a shell gland are only present in conchiferans. 
Moreover, fossil taxa do not help clarify the plesiomorphic state of the 
molluscan scleritome as Odontogriphus lacked both sclerites and shells”, 
Wiwaxia had uncalcified, chitinous sclerites, and other putative stem- 
group molluscs had calcareous sclerites and/or shells’. Therefore, 
organization of the ancestral scleritome, if present, remains ambiguous. 

In summary, our robustly supported evolutionary framework for 
Mollusca consists of two major clades: Aculifera, which includes a mono- 
phyletic Aplacophora sister to Polyplacophora, and Conchifera (as 
sampled), including a gastropod/bivalve clade we term Pleiostomollusca. 
Neomeniomorpha was not placed as the basal-most molluscan lineage 
as previously suggested nor is the Testaria hypothesis supported. Thus, 
several aplacophoran features commonly argued to be molluscan ple- 
siomorphies (for example, non-muscular foot, organization of midgut, 
primarily distichous radula without subradular membrane) are reinter- 
preted as aplacophoran synapomorphies, whereas others are reinter- 
preted as neomenioid apomorphies (for example, prepedal cirri, 
pericalymma-type larva). Within Conchifera, our results show that 
gastropods are sister to bivalves (not cephalopods), a result that has 
important implications for molluscan model systems. Also, possible 
independent evolution of highly cephalized morphologies in gastropods 
and cephalopods suggests additional work addressing neural features 
across conchiferans is needed”. 


METHODS SUMMARY 


RNA was extracted from 20 mollusc species representing 18 OTUs, reverse tran- 
scribed to cDNA, and sequenced using 454 GS-FLX or Titanium (Roche; 
Supplementary Table 2). Sanger expressed sequence tag (EST) libraries generated 
for Scutopus and Wirenia were also included in this study. These data were augmented 
with publicly available data (Supplementary Table 3). ESTs were cleaned, assembled 
and translated using EST2Uni*. Unigenes (contigs and singletons) were parsed into 
putatively orthologous groups (OGs) with HaMStR”**. 

Each OG was aligned and manually evaluated to trim out obviously mis- 
translated regions, screen for paralogues and combine two or more incomplete 
sequences representing the same orthologue into a consensus sequence. For each 
OG, ML trees were inferred in RAxML 7.2.7 (ref. 27) using the best fitting amino 
acid substitution model. For OGs with apparent paralogues, suspect sequences 
were removed or the OG was excluded from further analysis. Additional filtering 
was used on the neomenioid aplacophoran data sets to identify and remove 
cnidarian contamination (see Methods). 

Phylogenetic analyses of the final matrix were performed using ML with the best 
fitting model for each gene in RAxML and BI with the CAT model in Phylobayes 
2.3 (ref. 28) on the Alabama Supercomputer Authority’s Dense Memory Cluster 
(http://www.asc.edu/). Stability of each OTU was calculated using the leaf stability 
index implemented in Phyutility’? and alternative hypotheses of molluscan 
relationships were evaluated using AU tests with the WAG+Γ+F model in 
RAxML. Ancestral state reconstructions were performed based on a modified 
morphological matrix* using maximum parsimony in Mesquite 2.74 (http://mes- 
quiteproject.org/). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Overview. Data collection and analyses were conducted in four basic steps: (1) RNA was 
extracted from mollusc species, and cDNA was prepared and then sequenced; (2) EST 
data were processed with a bioinformatics pipeline incorporating EST2Uni** and 
HaMStR*; (3) trees were reconstructed with RAxML 7.2.7 (ref. 27) and Phylobayes 
2.3 (ref. 28); and (4) additional measures, including leaf stability with Phyutility” and 
Approximately Unbiased (AU) tests*®, were used to assess robustness of the results. 
Molecular techniques. Complementary DNA was prepared using standard pro- 
tocols and sequenced using either 454 GS-FLX or Titanium. Sanger EST libraries 
generated for Scutopus and Wirenia were also included in this study. See 
Supplementary Methods for detailed laboratory methods. 

Sequence processing. Raw ESTs were processed and assembled using the EST2uni 
pipeline”. This software removes low-quality regions with lucy*’, removes vector 
sequences with lucy and SeqClean (http://compbio.dfci.harvard.edu/tgi/software), 
masks low complexity regions with RepeatMasker (http://www.repeatmasker. 
org), and assembles contigs with CAP3 (ref. 32). Data on sequence quality were 
used by CAP3 when available. Unigenes were translated with ESTScan*’ and 
sequences shorter than 100 amino acids were deleted. Manual BLAST searches of 
samples of unigenes for vector sequences as well as examination of contig assembly 
diagrams generated by EST2uni indicated that these programs performed well at 
removing vector and low-quality sequences and assembling contigs, respectively. 

To reduce the amount of missing data per taxon, sequences from two or more 
closely related taxa were combined to create the following 11 chimaerical OTUs: 
Chitonida, Crassostrea, Dreissena, Haliotis, Helicoidea, Loligo, Mytilus, Pectinidae, 
Pedicellina, Sipuncula and Venerupis. 
Orthology assignment and data set assembly. OG identification used HaMStR 
local 7 (ref. 26), which uses profile hidden Markov models (pHMMs) generated 
from completely sequenced reference taxa in the InParanoid database™*. Translated 
unigenes were searched against the 1,032 single-copy OGs of HaMStR’s ‘model 
organism’ pHMMs derived from Homo, Ciona, Drosophila, Caenorhabditis and 
Saccharomyces. Translated unigenes matching an OG’s pHMM were then com- 
pared to the proteome of Drosophila using BLASTP. If the Drosophila protein 
contributing to the pHMM was the best BLASTP hit, the unigene was then placed 
in that OG. 

If one of the first or last 20 characters of an amino acid sequence was an X 
(corresponding to a codon with an ambiguity, gap, or missing data), all characters 
between the X and that end of the sequence were deleted and treated as missing data. 
This step was important as ends of singletons were occasionally, but obviously, 
mistranslated. Each OG was aligned with MAFFT* using the default alignment 
strategy. Aligned OGs were then manually inspected and subjected to trimming or 
deleting of partially mistranslated sequences, screening for paralogues, and com- 
bining incomplete sequences from the same OTU into one, more complete con- 
sensus sequence. These alignments were then trimmed with Aliscore and Alicut** to 
remove regions with ambiguous alignment or little to no phylogenetic signal. Lastly, 
any alignments less than 25 amino acids in length were discarded. 
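
The end-trimming rule described above (an X within the first or last 20 characters causes everything between it and that end to be treated as missing data) can be illustrated with a minimal sketch. This is not the script used in the study. The function name, the choice to mask the X itself, the use of the innermost X when several fall inside the window, and the use of '-' for missing data are all assumptions made for illustration.

```python
def trim_ambiguous_ends(seq: str, window: int = 20, missing: str = "-") -> str:
    """Mask each sequence end up to and including an 'X' that lies
    within `window` residues of that end (hypothetical helper)."""
    residues = list(seq)
    n = len(residues)

    # N-terminal side: innermost X among the first `window` residues.
    head_cut = seq[:window].rfind("X")
    for i in range(head_cut + 1):  # no-op when head_cut == -1
        residues[i] = missing

    # C-terminal side: innermost X among the last `window` residues.
    tail_start = max(n - window, 0)
    tail_cut = seq.find("X", tail_start)
    if tail_cut != -1:
        for i in range(tail_cut, n):
            residues[i] = missing

    return "".join(residues)
```

Masking rather than deleting keeps the sequence length unchanged, so the residues that remain keep their original coordinates.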

Maximum likelihood (ML) trees were inferred for each OG in RAxML 7.2.7 
(ref. 27) with the best-fitting amino acid substitution model, as determined by 
the RAxML amino acid substitution model selection Perl script. OGs with strongly 
supported deep nodes suggesting the inclusion of paralogues were edited to delete 
obviously paralogous sequences or discarded. To reduce missing data in the final 
matrices, only OGs with sequences from at least ten molluscs were retained for 
analysis. 

If an OG still possessed more than one sequence from one or more OTUs 
(inparalogues), the sequence with the shortest average pairwise distance to all 
others was retained. Pairwise distances were calculated using a gamma distribution 
with four rate categories as implemented in SCaFoS”. If two or more sequences 
from the same taxon were >10% divergent, all sequences from that taxon were 
discarded from that OG. To visualize the amount of data sampled for each taxon, a 
gene sampling diagram (Fig. 3) was created using MARE (http://mare.zfmk.de). 
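
The inparalogue rule above can be sketched as follows. This is a simplified illustration, not SCaFoS: uncorrected p-distances stand in for the gamma-corrected distances used in the study, "all others" is assumed to mean the aligned sequences from other OTUs in the same OG, and the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional


def p_distance(a: str, b: str) -> float:
    """Fraction of differing sites among positions resolved in both sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-?X" and y not in "-?X"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def pick_inparalogue(candidates: List[str], others: List[str],
                     max_divergence: float = 0.10) -> Optional[str]:
    """Return the candidate closest on average to `others`, or None when any
    two candidates from the same OTU differ by more than `max_divergence`."""
    for a, b in combinations(candidates, 2):
        if p_distance(a, b) > max_divergence:
            return None  # the OTU is dropped from this OG
    return min(candidates,
               key=lambda c: sum(p_distance(c, o) for o in others) / len(others))
```

The function expects at least one sequence in `others`; an OG containing a single OTU would need separate handling.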
Contamination screening. Neomenioids have been reported to harbour nucleic 
acid contamination from their prey**. Given this, specimens of Wirenia argentea 
(which feed on cnidarians) were starved for 2 months before RNA extraction. Gut 
content analysis of Neomenia sp. confirmed that this undescribed Antarctic species 
(see Supplementary Results) also feeds on cnidarians. Therefore, Neomenia uni- 
genes were compared to predicted transcripts of Lottia and Nematostella using 
TBLASTX and sequences with a lower E-value for Nematostella than Lottia (that 
is, sequences more similar to a sequence in the proteome of Nematostella than 
Lottia) were discarded. ML trees for each gene were manually evaluated and any 
remaining cnidarian contamination in the neomenioid data sets was removed by 
deleting sequences which either formed a clade with Nematostella or were part of a 
polytomy that included Nematostella. Finally, Nematostella was included in ana- 
lyses with broad outgroup sampling (Supplementary Figs 3 and 4) to demonstrate 
that there is no obvious attraction between it and either neomenioid. 
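
The first, E-value-based filtering step described above can be sketched as below. The input layout (one pair of best E-values per unigene, already parsed from the TBLASTX output) and the treatment of a missing hit as an infinitely large E-value are assumptions for illustration; the function name is hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def cnidarian_like(
    best_evalues: Dict[str, Tuple[Optional[float], Optional[float]]]
) -> Set[str]:
    """Return unigene IDs whose best Nematostella E-value is lower than
    their best Lottia E-value. Values are (lottia, nematostella)."""
    flagged = set()
    for unigene, (lottia, nematostella) in best_evalues.items():
        # Assumption: no hit is treated as an infinitely large E-value.
        lottia_e = float("inf") if lottia is None else lottia
        nema_e = float("inf") if nematostella is None else nematostella
        if nema_e < lottia_e:
            flagged.add(unigene)
    return flagged
```

Sequences flagged here would be discarded; the tree-based screening that follows in the text is a separate, manual step that this sketch does not cover.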
Phylogenetic analyses. Phylogenetic analyses were conducted using ML in 
RAxML 7.2.7 (ref. 27) and BI in PhyloBayes 2.3 (ref. 28) on the Alabama 
Supercomputer Authority Dense Memory Cluster (http://www.asc.edu/). For 
ML analyses, the best fitting amino acid substitution model for each gene was 
determined using the RAxML model selection Perl script. This script tests the fit of 
each available model of amino acid substitution by optimizing model parameters 
and branch lengths on a JTT start tree for each OG. Additionally, for comparative 
purposes, ML analyses using one model for the entire matrix were performed 
using the WAG + CAT + F and LG + CAT + F models in RAxML (Supplemen- 
tary Figs 11 and 12) and an approximately ML analysis was performed using the 
JTT + CAT model in FastTree 2.1 (ref. 39, Supplementary Fig. 13). Topological 
robustness (that is, nodal support) for all ML analyses was assessed with 100 
replicates of nonparametric bootstrapping. Stabilities of OTUs among the boot- 
strapped trees were calculated using the leaf stability index in Phyutility”. 
Competing hypotheses of mollusc phylogeny were evaluated using the AU test*° 
with the best-fitting model for each partition. For all BI analyses, the CAT model 
was used to account for site-specific rate heterogeneity~*. Unless otherwise noted, 
all BI analyses were conducted with five parallel chains run for 15,000 cycles each, 
with the first 5,000 trees discarded as burn-in. A 50% majority rule consensus tree 
was computed from the remaining 10,000 trees from each chain. Topological 
robustness was assessed using posterior probabilities. Maxdiff values below 0.3 
indicated that all chains in a run had converged. 
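
The burn-in and consensus procedure above can be illustrated with a small sketch. It is not the PhyloBayes implementation. Each sampled tree is represented simply as a set of clades (frozensets of taxon names), and post-burn-in samples from all chains are assumed to be pooled before clade frequencies are computed; both choices, and the function name, are assumptions.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Set

Clade = FrozenSet[str]
Tree = Set[Clade]


def majority_rule_support(chains: List[List[Tree]], burn_in: int,
                          threshold: float = 0.5) -> Dict[Clade, float]:
    """Discard the first `burn_in` trees of every chain, pool the rest, and
    return clades whose frequency exceeds `threshold` with that frequency."""
    pooled = [tree for chain in chains for tree in chain[burn_in:]]
    counts = Counter(clade for tree in pooled for clade in tree)
    total = len(pooled)
    return {clade: n / total for clade, n in counts.items() if n / total > threshold}
```

With the settings reported above (five chains of 15,000 cycles and a burn-in of 5,000), `pooled` would hold 50,000 trees, and the returned frequencies correspond to the posterior probabilities shown on the consensus tree.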

Ancestral state reconstruction. Ancestral character state reconstruction was per- 
formed using an updated and modified version of the morphological matrix from 
ref. 4 in Mesquite 2.74 (http://mesquiteproject.org/) using maximum parsimony as 
the reconstruction method. 
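
To show how maximum parsimony can return an 'equivocal' ancestral state, the following minimal sketch applies Fitch's algorithm for unordered characters to a toy binary tree. It is an illustration only, not the Mesquite implementation; the tree encoding (nested tuples), the tip names and the character states are hypothetical and do not come from the matrix analysed here.

```python
from typing import Dict, Set, Tuple, Union

Node = Union[str, Tuple["Node", "Node"]]


def fitch(node: Node, tip_states: Dict[str, Set[str]]) -> Tuple[Set[str], int]:
    """Return (state set at `node`, minimum number of changes below it)."""
    if isinstance(node, str):  # a tip: its observed state(s), no changes
        return set(tip_states[node]), 0
    left, right = node
    left_set, left_cost = fitch(left, tip_states)
    right_set, right_cost = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:  # the children agree on at least one state
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: two clades fixed for different states.
toy_tree = (("t1", "t2"), ("t3", "t4"))
toy_states = {"t1": {"present"}, "t2": {"present"},
              "t3": {"absent"}, "t4": {"absent"}}
root_states, changes = fitch(toy_tree, toy_states)
```

A root set containing more than one state, as in this toy case, is what is reported as equivocal; a single-state set is an unambiguous reconstruction.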
Antibiotic resistance is ancient 


Vanessa M. D’Costa, Christine E. King, Lindsay Kalan, Mariya Morar, Wilson W. L. Sung, Carsten Schwarz, 
Duane Froese, Grant Zazula, Fabrice Calmels, Regis Debruyne, G. Brian Golding, Hendrik N. Poinar & Gerard D. Wright 


The discovery of antibiotics more than 70 years ago initiated a 
period of drug innovation and implementation in human and 
animal health and agriculture. These discoveries were tempered 
in all cases by the emergence of resistant microbes’. This history 
has been interpreted to mean that antibiotic resistance in patho- 
genic bacteria is a modern phenomenon; this view is reinforced by 
the fact that collections of microbes that predate the antibiotic era 
are highly susceptible to antibiotics’. Here we report targeted 
metagenomic analyses of rigorously authenticated ancient DNA 
from 30,000-year-old Beringian permafrost sediments and the 
identification of a highly diverse collection of genes encoding res- 
istance to f-lactam, tetracycline and glycopeptide antibiotics. 
Structure and function studies on the complete vancomycin resist- 
ance element VanA confirmed its similarity to modern variants. 
These results show conclusively that antibiotic resistance is a 
natural phenomenon that predates the modern selective pressure 
of clinical antibiotic use. 

Recent studies show that modern environmental and human commensal 
microbial genomes have a much larger concentration of antibiotic 
resistance genes than has been previously recognized*®. In addition, 
metagenomic studies have revealed diverse homologues of known 
resistance genes broadly distributed across environmental locales. 
This widespread dissemination of antibiotic resistance elements is 
inconsistent with a hypothesis of contemporary emergence and 
instead suggests a richer natural history of resistance’. Indeed, 
estimates of the origin of natural product antibiotics range from 
2 Gyr to 40 Myr ago”®, suggesting that resistance should be similarly 
old. Previous publications claim to have cultured resistant bacteria 
from Siberian permafrost (for example ref. 9), but these results remain 
contentious (see Supplementary Information). 

To determine whether contemporary resistance elements are modern 
or whether they originated before our use of antibiotics, we analysed 
DNA sequences recovered from Late Pleistocene permafrost sediments. 
The samples were collected east of Dawson City, Yukon, at the Bear 
Creek (BC) site (Fig. 1); prominent forms of ground ice (ice wedges and 
surface icings) are preserved in the exposure, immediately overlain by a 
distinctive volcanic ash layer, the Dawson tephra'®"' (Supplementary 
Table 1 and Supplementary Figs 1 and 2). The tephra has been dated at 
several sites in the area to about 25,300 radiocarbon (¹⁴C) years BP, or 
about 30,000 calendar years’””*. The cryostratigraphic context is similar 
to other sites in the area preserving relict permafrost and indicates that 
the permafrost has not thawed since the time of deposition (Sup- 
plementary Information). In the absence of fluid leaching, the site repre- 
sents an ideal source of uncontaminated and securely dated ancient 
DNA. 

Two frozen sediment cores (BC1 and BC4), 10 cm apart, were 
obtained 50 cm below the tephra. In accordance with appropriate 
protocols’’, we monitored contamination introduced during coring 
by spraying the drilling equipment and the outer surface of the cores 


with high concentrations of Escherichia coli harbouring the gfp (green 
fluorescent protein) gene from Aequorea victoria (Supplementary 
Information). 

After fracturing of the samples (Supplementary Fig. 3), total DNA 
was extracted from a series of five subsamples taken along the radius of 
each core (Supplementary Information). Quantitative polymerase 
chain reaction (qPCR) analysis confirmed extremely high yields of gfp 
on both core exteriors, with 0.1% or less of this amount at the centre 
(Supplementary Information and Supplementary Fig. 4). This supports 
negligible leaching or cross-contamination during subsampling. 

Figure 1 | Stratigraphic profile and location of Bear Creek site. Elevation is 
given in metres above base of exposure. Permafrost samples from below 
Dawson tephra were dated to about 30 kyr BP. Preservation of the ice below and 
above the sample indicates that the sediments have not thawed since 
deposition. Silhouettes represent mammals and birds identified from ancient 
DNA sequences that are typical of the regional Late Pleistocene environment. 
aDNA, ancient DNA. 

A crucial step lending support for the authenticity of the ancient 
DNA was to confirm the presence of DNA derived from flora and fauna 
characteristic of a late Pleistocene age, and the absence of common 
modern or Holocene floral and faunal sources. To explore the vertebrate 
and plant diversity, we amplified fragments of the mitochondrial 12S 
rRNA and chloroplast trnL and rbcL genes (Supplementary Table 3). 
Amplicons were sequenced with the 454 GS-FLX platform and iden- 
tified by BLAST analysis of GenBank sequences (Supplementary 
Information). 

The vertebrate sequences included abundant Late Pleistocene 
megafauna such as Bison, Equus and Ovis, as well as rodents (Microtus 
and Ellobius) and the rock ptarmigan, Lagopus mutus (Supplementary 
Fig. 6 and Supplementary Table 5). Mammuthus was detectable at low 
copy numbers with the use of a mammoth-specific qPCR assay, which is 
consistent with the low ratio of these fossils relative to bison and horse in 
the region'"*. The rbcL and trnL sequences revealed many plant groups 
that are also well documented in Beringia, including the grasses Poa and 
Festuca, sage (Artemisia) and willow (Salix)'° (Supplementary Figs 7 and 
8, and Supplementary Tables 6 and 7). No sequences of common 
Holocene vertebrates (for example elk or moose) or plants (for example 
spruce) were identified despite sequence conservation across the primer- 
binding sites; these results are consistent with other reports’® that have 
argued against DNA leaching in permafrost sediments. 

We focused our investigation of bacterial 16S rRNA sequences on 
the Actinobacteria, known for their ability to synthesize diverse 
secondary metabolites and for harbouring antibiotic resistance genes’. 
Deep sequencing of 16S amplicons (Supplementary Information) 
revealed genera commonly found in soil and permafrost microbial 
communities!’, including Aeromicrobium, Arthrobacter and Frankia 
(Supplementary Fig. 9 and Supplementary Table 8). Analysis of con- 
taminant 16S sequences derived from extraction and PCR control re- 
actions (Supplementary Table 4) suggested that these do not 
contribute to the ancient DNA data set; in fact not only were the copy 
numbers 1,000-30,000-fold lower than from the permafrost extracts, 
but with the exception of unclassified bacteria there was also very 
little overlap in the genera identified (Supplementary Fig. 9 and 
Supplementary Table 8). Querying the permafrost sequences against 
the contaminant data set with the use of BLAST further confirmed 
their disparity: only 1% of the reads had 95-100% identity to a con- 
taminant sequence, with a single sequence showing 100% identity. 

We next developed a series of assays to detect genes encoding resist- 
ance to several major classes of antibiotic and representing diverse 
strategies of drug evasion (for example target modification, target pro- 
tection and enzymatic drug inactivation) (Supplementary Information). 
Determinants included the ribosomal protection protein TetM, which 
confers resistance to tetracycline antibiotics by weakening the 
interaction between the drug and the ribosome; the D-Ala-D-Ala dipeptide 
hydrolase VanX, which is a component of the vancomycin resistance 
operon; the aminoglycoside-antibiotic-modifying acetyltransferase 
AAC(3); a penicillin-inactivating β-lactamase Bla (a member of the 
TEM group of β-lactamases); and the ribosome methyltransferase 
Erm, which blocks the binding of macrolide, lincosamide and type B 
streptogramin antibiotics. Amplification of vanX, tetM and bla frag- 
ments was successful, and triplicate PCR products from multiple 
extracts were cloned and multiple clones were sequenced. 

The β-lactamase sequences demonstrated amino-acid identities 
between 53% and 84% with known determinants and clustered with 
one of two groups of enzymes: characterized β-lactamases from 
streptomycetes and uncharacterized β-lactamase-like hydrolytic proteins 
(Fig. 2a and Supplementary Fig. 14). We identified several tetM-related 
genes in the permafrost, most of which were most closely related to the 
actinomycete subset of ribosomal protection proteins, including the 
biochemically characterized self-resistance element OtrA from the 
oxytetracycline producer Streptomyces rimosus'* (Fig. 2b). Most intri- 
guing was the identification of vanX gene fragments, which spanned 
the entire phylogenetic space of characterized vancomycin resistance 
determinants found in the clinic and in the environment. These branch 
away from the cellular dipeptidases that are the likely progenitors of the 
vanX family (Supplementary Fig. 10). 

Vancomycin resistance took the clinical community by surprise 
when it emerged in pathogenic enterococci in the late 1980s’”. In both 
clinical pathogens”’ and contemporary soil environments’, resistance 
results from the acquisition of a three-gene operon vanH-vanA-vanX 
(vanHAX). These enzymes collectively reconstruct bacterial peptido- 
glycan to terminate in D-alanine-D-lactate in place of the canonical 
D-alanine-D-alanine, which is required for vancomycin binding and 
subsequent antibiotic action. Although most forms of resistance are 
attributed to a single gene, this complex mechanism is exclusively 
associated with resistance and thus its presence provides unambiguous 
confirmation of its role as a resistance determinant. 

Figure 2 | Genetic diversity of ancient antibiotic resistance elements. 
a, b, Unrooted Bayesian phylogenies of translated β-lactamase (bla) (a) and 
tetracycline resistance (tetM) (b). Blue denotes predicted resistance enzymes, 
and green those associated with other functions; permafrost-derived sequences 
are labelled with the originating core name. Sequences in which resistance 
activity has been biochemically verified are noted with a single asterisk 
(Supplementary Information). The scale bar represents 0.1 substitutions per 
site. Posterior probabilities are shown for a, and those of 0.7 or more are 
indicated for b. All unlabelled tips derive from ancient sequences. BC1, Bear 
Creek sample 1; BC4, Bear Creek sample 4. 

Figure 3 | Ancient vancomycin resistance elements. a, vanHAX amplicons 
used in this study, with primer names noted above each arrow. b, Unrooted 
Bayesian phylogeny of translated vanA sequences; blue denotes strains with 
vanHAX clusters confirmed to confer resistance; sequences containing stop 
codons but homology throughout are noted with a single asterisk 
(Supplementary Information). BC1, Bear Creek sample 1; BC4, Bear Creek 
sample 4. c, VanA_A2 structure. Left: ribbon diagram of the VanA_A2 dimer (blue) 
overlaid with modern VanA (green), where the Ω-loop is coloured red; right: 
ball-and-stick representation of ATP binding. The electron density shown is an 
Fo − Fc map contoured at 3σ. d, Comparison of modern and ancient VanA 
monomer structures. The Ω-loop is coloured red and detailed in the ball-and- 
stick figures. Ligands are shown in grey. Dashed lines represent hydrogen 
bonds. 

With few exceptions, the vanHAX operon is invariant in genetic 
organization; it therefore offers a matchless template for confirming its 
presence with PCR assays that span the vanHA and vanAX boundaries. 
Two short qPCR assays were designed to confirm this contiguity 
(Fig. 3a and Supplementary Information). Positive results, including 
particularly high yields of the smallest amplicon, A6X (Supplementary 
Table 9), encouraged us to attempt amplification across both 
boundaries (that is, the complete vanA gene) in a single 1.2-kilobase 
amplicon. We also targeted fragments anchored on either boundary 
and extending as far as possible into vanA. None of the sequences from 
these products, or those generated by an independent laboratory 
(Supplementary Information), were present in GenBank. No contaminants 
were detected in more than 300 control reactions. 

Table 1 | vanHAX permutation tests 

Amplicon   Number   Length (base pairs)   Probability of similarity by chance alone to Streptomyces coelicolor genes
                                          vanH            vanA             vanX
H1A1       164      203-213               3.59 x 10-3     439x101’         0.24
H1A1*      12       209-216               2.83 x 10°3     8.16 x 10 16     0.28
H2A3       24       573-605               9.83 x 10°3     1.27 x 10-54     0.22
H2A4       79       666-681               4.33 x 10°3     6.15 x 10 °°     0.18
A6X        159      170-179               0.11            6.87 x 108       5.64 x 10°
A6X*       11       176-179               0.04            2.96 x 10 8      3.63 x 10°
A2X        96       735-796               0.11            1.80 x 10°°9     1.35 x 10°
HAX†       40       1,173-1,204           5.95 x 10°      9.32 x 10 %      6.47 x 10°”

* Clones from independent replication in France. † Includes both H1AX and H2AX. 


Phylogenetic analyses showed that many of the ancient vanHAX 
sequences cluster with characterized glycopeptide-resistant strains of 
Actinobacteria containing vanHAX cassettes (for example streptomycetes, 
glycopeptide-producing Amycolatopsis species and the nitrogen-fixing 
Frankia sp. EAN1pec) (Fig. 3b and Supplementary Figs 11 and 
12). Another group falls between the actinobacterial sequences and the 
Firmicutes-derived cluster, which includes environmental Paenibacillus 
isolates and the pathogenic Enterococci, and may reflect an intermediate 
group. 

Permutation tests were performed with the PRSS algorithm”® (1,000 
permutations each) to confirm that the sequences were statistically 
similar to those of vancomycin resistance genes (vanHAX) present 
in modern Streptomyces. As shown in Table 1, all vanHA-spanning 
clones have significant similarity to vanH and vanA, and all 
vanAX-spanning clones have significant similarity to vanA and vanX. 
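
The logic of a sequence permutation test can be sketched as follows. This is a simplified illustration and not the PRSS program: PRSS fits an extreme-value distribution to the shuffled scores, whereas this sketch reports a plain empirical probability, and the scoring scheme (match 2, mismatch −1, linear gap −2), the function names and the seed are assumptions chosen for brevity.

```python
import random


def local_score(a: str, b: str, match: int = 2, mismatch: int = -1,
                gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def permutation_p_value(query: str, target: str, n_perm: int = 1000,
                        seed: int = 0) -> float:
    """Empirical probability that a shuffled target scores at least as well
    against the query as the real target does."""
    rng = random.Random(seed)
    observed = local_score(query, target)
    residues = list(target)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(residues)
        if local_score(query, "".join(residues)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0
```

Shuffling preserves the length and composition of the target, so a small probability indicates similarity beyond what composition alone would produce.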

To ascertain whether the complete vanA sequences are indeed func- 
tional and do not represent PCR artefacts or pseudogenes, we synthe- 
sized four open reading frames from the 40 H1AX/H2AX sequences 
(Supplementary Information). Two of these generated soluble proteins 
suitable for purification to homogeneity. Enzymatic characterization 
indicated that these ligases were indeed D-alanine-D-lactate-specific 
(Supplementary Fig. 13), and analysis revealed steady-state kinetic 
parameters consistent with contemporary enzymes derived from both 
the clinic and the environment (Supplementary Table 10). These 
results clearly show that the vanHAX genes identified in the ancient 
samples encode enzymes capable of genuine antibiotic resistance. 

We further confirmed the link between 30,000-year-old VanA and 
contemporary enzymes by determining the three-dimensional 
structure of VanA_A2 by X-ray crystallography (Supplementary Table 11 and 
Supplementary Information). The quaternary and tertiary structures 
of VanA_A2, crystallized in the ATP-bound form, show the overall 
D-Ala-D-X ligase fold of modern enzymes including VanA from 
vancomycin-resistant Enterococcus faecium (Fig. 3c, d). Superposition 
of ancient and modern VanA (Fig. 3c, d) reveals conservation of 
quaternary and tertiary structure with minor differences in Mg2+ 
and ATP γ-phosphate coordination. The Ω-loop comprises the biggest 
structural change; 13 amino-terminal residues (233-246) are absent 
from the electron density map of VanA_A2, including His 241 (His 244 
in modern VanA), responsible for the lactate selectivity. The last seven 
Ω-loop residues (247-253) have clear electron density, undergoing a 
drastic 13 Å shift. These structural differences, however, are not 
reflected in enzyme function. 

This work firmly establishes that antibiotic resistance genes predate 
our use of antibiotics and offers the first direct evidence that antibiotic 
resistance is an ancient, naturally occurring phenomenon widespread 
in the environment. This is consistent with the rapid emergence of 
resistance in the clinic and predicts that new antibiotics will select for 
pre-existing resistance determinants that have been circulating within 
the microbial pangenome for millennia. This reality must be a guiding 
principle in our stewardship of existing and new antibiotics. 


METHODS SUMMARY 


Permafrost cores were collected at Bear Creek, Yukon, then shipped frozen to the 
McMaster Ancient DNA Centre and stored at −40 °C. All subsequent procedures
before PCR/qPCR amplification were performed in dedicated clean rooms, physically 
separated from laboratories containing modern DNA, bacterial cultures and amp- 
lification products. Contaminant leaching into the centre of cores after sampling was 
monitored by qPCR assays designed to detect E. coli DNA encoding the jellyfish green 
fluorescent protein sprayed onto coring equipment and the external surfaces of all 
collected cores. DNA was extracted from the centre of subsampled permafrost cores. 
PCR assays were designed to target vertebrates, plants, bacteria and specific antibiotic 
resistance elements. All products were sequenced with either the 454 GS-FLX 
platform or by standard cloning and sequencing procedures (GenBank accession 
numbers JN316287-JN366376). The ancient vanA gene identified from the 
permafrost was synthesized and expressed in E. coli, and the His6-tagged protein
was purified by immobilized metal-affinity chromatography. This protein was 
used in enzymatic studies to determine steady-state kinetics and was also studied 
by crystallography using the vapour-diffusion hanging-drop method. Data were 
collected at the National Synchrotron Light Source, Brookhaven National 
Laboratory, beamline X25 (PDB 1E4E). 


Received 28 March; accepted 22 July 2011. 
Published online 31 August 2011. 


1. Livermore, D. M. Has the era of untreatable infections arrived? J. Antimicrob. 
Chemother. 64, i29-i36 (2009). 

2. Wright, G. D. The antibiotic resistome: the nexus of chemical and genetic diversity. 
Nature Rev. Microbiol. 5, 175-186 (2007). 

3. Hughes, V. M. & Datta, N. Conjugative plasmids in bacteria of the ‘pre-antibiotic’ 
era. Nature 302, 725-726 (1983). 

4. D’Costa, V. M., McGrann, K. M., Hughes, D. W. & Wright, G. D. Sampling the
antibiotic resistome. Science 311, 374-377 (2006).

5. Dantas, G., Sommer, M. O. A., Oluwasegun, R. D. & Church, G. M. Bacteria subsisting
on antibiotics. Science 320, 100-103 (2008).

6. Sommer, M. O. A., Dantas, G. & Church, G. M. Functional characterization of the
antibiotic resistance reservoir in the human microflora. Science 325, 1128-1131
(2009).

7. Baltz, R. H. Antibiotic discovery from actinomycetes: will a renaissance follow the
decline and fall? SIM News 55, 186-196 (2005).

8. Hall, B. G. & Barlow, M. Evolution of the serine β-lactamases: past, present and
future. Drug Resist. Updat. 7, 111-123 (2004).

9. Mindlin, S. Z., Soina, V. S., Petrova, M. A. & Gorlenko, Z. M. Isolation of antibiotic
resistance bacterial strains from Eastern Siberia permafrost sediments. Russ. J.
Genet. 44, 27-34 (2008).

10. Froese, D. G., Zazula, G. D. & Reyes, A. V. Seasonality of the late Pleistocene Dawson
tephra and exceptional preservation of a buried riparian surface in central Yukon
Territory, Canada. Quat. Sci. Rev. 25, 1542-1551 (2006).

11. Froese, D. G. et al. The Klondike goldfields and Pleistocene environments of
Beringia. GSA Today 19, 4-10 (2009).

12. Brock, F., Froese, D. G. & Roberts, R. G. Low temperature (LT) combustion of
sediments does not necessarily provide accurate radiocarbon ages for site
chronology. Quat. Geochronol. 5, 625-630 (2010).

13. Willerslev, E., Hansen, A. J. & Poinar, H. N. Isolation of nucleic acids and cultures
from fossil ice and permafrost. Trends Ecol. Evol. 19, 141-147 (2004).

14. Harington, C. R. & Clulow, F. V. Pleistocene mammals from Gold Run Creek, Yukon
Territory. Can. J. Earth Sci. 10, 697-759 (1973).

15. Zazula, G. D. et al. Ice-age steppe vegetation in east Beringia. Nature 423, 603
(2003).

16. Haile, J. et al. Ancient DNA reveals late survival of mammoth and horse in interior
Alaska. Proc. Natl Acad. Sci. USA 106, 22352-22357 (2009).

17. Gilichinsky, D. et al. in Psychrophiles: From Biodiversity to Biotechnology
(eds Margesin, R., Schinner, F., Marx, J.-C. & Gerday, C.) 83-102 (Springer, 2008).

18. Doyle, D., McDowall, K. J., Butler, M. J. & Hunter, I. S. Characterization of an
oxytetracycline-resistance gene, otrA, of Streptomyces rimosus. Mol. Microbiol. 5,
2923-2933 (1991).

19. Courvalin, P. Vancomycin resistance in gram-positive cocci. Clin. Infect. Dis. 42,
S25-S34 (2006).

20. Pearson, W. R. & Lipman, D. J. Improved tools for biological sequence comparison.
Proc. Natl Acad. Sci. USA 85, 2444-2448 (1988).


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank A. Guarné for assistance in X-ray data collection. This 
work was supported by Canada Research Chairs to D.F., H.N.P. and G.D.W., a Canadian 
Institutes of Health Research operating grant to G.D.W. (MOP-79488) and a scholarship
to V.M.D., and by grants from the Natural Sciences and Engineering Research Council of
Canada to D.F. and H.N.P. and a scholarship to C.E.K.



Author Contributions D.F., G.Z. and F.C. collected permafrost cores and performed 
geochemical analyses followed by subsampling by C.S., V.M.D. and C.E.K. C.E.K.
performed ancient DNA laboratory work and 454 sequencing. V.M.D. designed primers 
for resistance genes, 16S and gfp. V.M.D. and C.E.K. designed and optimized the qPCR 
assays, and cloned and sequenced the resistance gene products. R.D. independently 
confirmed the results in France. L.K. purified and characterized VanA, and M.M. 
crystallized VanA and determined the three-dimensional structure. W.S., G.B.G., C.E.K. 
and H.N.P. processed and analysed the floral/faunal data; V.M.D. and G.B.G. performed 
phylogenetic and bioinformatic analyses of the resistance gene sequences. H.N.P. and 
G.D.W. conceived the project, and V.M.D., C.E.K., D.F., H.N.P. and G.D.W. wrote the 
manuscript. All authors edited the final draft. 


Author Information The metagenomic sequences determined from permafrost are 
deposited in GenBank under accession numbers JN316287-JN366376. Reprints and 
permissions information is available at www.nature.com/reprints. The authors declare 
no competing financial interests. Readers are welcome to comment on the online 
version of this article at www.nature.com/nature. Correspondence and requests for 
materials should be addressed to G.D.W. (wrightge@mcmaster.ca) or H.N.P. 
(poinarh@mcmaster.ca).


22 SEPTEMBER 2011 | VOL 477 | NATURE | 461 



doi:10.1038/nature10392 


Evidence for several waves of global transmission in 
the seventh cholera pandemic
Vibrio cholerae is a globally important pathogen that is endemic in
many areas of the world and causes 3-5 million reported cases of 
cholera every year. Historically, there have been seven acknowledged 
cholera pandemics; recent outbreaks in Zimbabwe and Haiti are 
included in the seventh and ongoing pandemic’. Only isolates in 
serogroup O1 (consisting of two biotypes known as ‘classical’ and 
‘El Tor’) and the derivative O139 (refs 2, 3) can cause epidemic
cholera’. It is believed that the first six cholera pandemics were 
caused by the classical biotype, but El Tor has subsequently spread 
globally and replaced the classical biotype in the current pandemic’. 
Detailed molecular epidemiological mapping of cholera has been 
compromised by a reliance on sub-genomic regions such as mobile 
elements to infer relationships, making El Tor isolates associated 
with the seventh pandemic seem superficially diverse. To understand 
the underlying phylogeny of the lineage responsible for the current 
pandemic, we identified high-resolution markers (single nucleotide 
polymorphisms; SNPs) in 154 whole-genome sequences of globally 
and temporally representative V. cholerae isolates. Using this phylo- 
geny, we show here that the seventh pandemic has spread from the 
Bay of Bengal in at least three independent but overlapping waves 
with a common ancestor in the 1950s, and identify several transcon- 
tinental transmission events. Additionally, we show how the acquisi- 
tion of the SXT family of antibiotic resistance elements has shaped 
pandemic spread, and show that this family was first acquired at least 
ten years before its discovery in V. cholerae. 

Whole-genome analysis is perhaps the ultimate approach to build- 
ing a robust phylogeny in recently emerged pathogens, through the 
identification of SNPs and other rare genetic variants’. Therefore, we 
sequenced the genomes of 136 isolates of V. cholerae, the causative 
agent of several million cholera cases each year (http://www.who.int/ 
mediacentre/factsheets/fs107/en/). These sequences, including 113 
isolates from the seventh pandemic, were added to 18 previously pub- 
lished genomes'”* to produce a global genomic database from isolates 
collected in the course of a century. We included representative El Tor 
isolates collected in the past four decades and compared these to 
previously reported and novel genome sequences of both classical 
and non-O1 types’.

The sequence reads were mapped to the reference sequence of El 
Tor N16961 (ref. 6), a seventh-pandemic V. cholerae that was isolated 
in Bangladesh in 1975 (see footnote to Supplementary Table 1) and the 
resulting consensus tree identified eight distinct phyletic lineages (L1- 
L8, see Supplementary Fig. 1 and Supplementary Table 1 for strain and 
lineage information), six of which incorporated O1 clinical isolates. 
The classical isolates formed a distinct, highly clustered group (L1), 
distant from the El Tor isolates of the seventh pandemic (L2). It is clear
from Supplementary Fig. 1 that the classical and El Tor clades did not
originate from a recent common ancestor and instead seem to be 
independent derivatives with distinct phylogenetic histories, consist- 
ent with previous proposals’. Isolates of L4 share a common ancestor 
with previously reported non-conventional O1 isolates* (Supplemen- 
tary Fig. 2), and are likely to have acquired the O1 antigen genes by a 
recombination event onto a genetically distinct genome backbone. 
Isolates of L7 also have a distinct backbone, whereas L2, L3 (USA gulf 
coast strains), L5, L6 and L8 share a more ‘El-Tor-like’ genome back- 
bone, and the L1 backbone is of the ‘classical’ type. 

Genome-wide SNP analysis showed that the 123 El Tor isolates in 
the L2 cluster (Supplementary Fig. 1) differed from the reference by 
only 50-250 SNPs. With this large sample size we were able to con- 
struct a high-resolution phylogeny that shows unequivocally that the 
current pandemic is monophyletic and originated from a single source,
providing a framework for future epidemiological and phenotypic 
analysis of V. cholerae, including transmission-tracking and typing. 

Predicted recombined regions were identified, and along with geno- 
mic islands and mobile genetic elements, these were initially excluded 
from the phylogenetic analysis of seventh-pandemic isolates, to deter- 
mine the underlying phylogeny. Notably, analysis of the tree (Fig. 1; see 
Supplementary Fig. 3 for a tree with strain names) provides clear 
evidence of a clonal expansion of the lineage, with a strong temporal 
signature. This is most clearly illustrated by the fact that the most 
divergent isolates from the N16961 reference are represented by the 
oldest seventh-pandemic isolate in our collection, A6, collected in 
1957, together with the most recent Haitian isolates’ from late 2010. 
We performed a linear regression analysis on all the L2 isolates to 
calculate the rate of SNP accumulation on the basis of the date of 
isolation and the root-to-tip distance. The shape of the tree and temporal
signatures in Fig. 1 show a very consistent rate of SNP accumulation,
3.3 SNPs year⁻¹ (R² = 0.73, Supplementary Fig. 4) in the core
genome, emphasizing the tree’s robustness and utility for transmission
studies. The only exception to this is V. cholerae A4, a repeatedly
passaged laboratory strain that was originally isolated in 1973
(Supplementary Figs 3 and 4). The estimated rate of mutation
for our seventh-pandemic V. cholerae collection was 8.3 × 10⁻⁷
SNPs site⁻¹ year⁻¹: between 2.5 and 5 times slower than the rate estimated
for recent clonal expansions of some other human-pathogenic
bacteria*”’.
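
The root-to-tip regression described above can be sketched in a few lines of Python. This is an illustration only, not the software used in the study: the function name `fit_clock` and the data points are hypothetical, constructed to lie exactly on a line with the reported slope of 3.3 SNPs per year and the regression-based origin date of 1910 quoted later in the text. The final line shows how the per-genome and per-site rates relate, assuming the per-site rate is simply the per-genome rate divided by the core-genome length.

```python
def fit_clock(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance (SNPs) on isolation year.

    Returns (slope, x_intercept): slope is SNPs per year, and x_intercept is the
    year at which the fitted line reaches zero distance (the inferred root date).
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, -intercept / slope


# Hypothetical isolates placed exactly on a 3.3 SNPs/year clock rooted in 1910.
years = [1957, 1975, 1989, 2004, 2010]
root_to_tip = [3.3 * (y - 1910) for y in years]

rate, root_year = fit_clock(years, root_to_tip)

# Core-genome length implied by the two rates quoted in the text
# (3.3 SNPs/year and 8.3e-7 SNPs/site/year); roughly 4 million sites.
implied_core_sites = 3.3 / 8.3e-7
```

Real data would scatter around the line (the reported R² is 0.73), so the fitted slope and intercept would carry uncertainty that this noiseless example does not show.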

The seventh-pandemic tree can be subdivided into three major 
groups or clades by clustering using Bayesian analysis of population 
structure*’ (shown as waves 1-3 in Fig. 1); this clustering is mostly 
consistent with the cholera toxin (CTX) type of the three clades, which 
represent independent waves of transmission. Although examples of
genetic determinants differentiating these three CTX types have previously
been published", they have not been put into a phylogenetic
context, undermining efforts to investigate the evolutionary aspects of
their emergence. Perhaps as a result, there has been substantial uncertainty
in naming new CTX types as they have been discovered. Our
data show that the first CTX type is canonical CTX El Tor and we
propose that it is renamed CTX-1; for the other two we propose a new
expandable nomenclature and class them as CTX-2 and CTX-3
(Supplementary Table 2).

Figure 1 | A maximum-likelihood phylogenetic tree of the seventh
pandemic lineage of V. cholerae based on SNP differences across the whole
core genome, excluding probable recombination events. The pre-seventh-pandemic
isolate M66 was used as an outgroup to root the tree. Branches are
coloured on the basis of the region of isolation of the strains. The branches
representing the three major waves are indicated on the far right. The nodes
representing the MRCAs of the seventh pandemic, and subsequent waves 2 and
3, are indicated with arrows and labelled with inferred dates. The presence and
type of CTX and SXT elements in each strain are shown to the right of the tree.
The presence of toxin-linked cryptic (TLC) and repeated sequence 1 (RS1)
elements is shown, but their number and position, respectively, are arbitrarily
assigned. Cases of sporadic intercontinental transmission are marked A-D.
The dates shown are the median estimates for the indicated nodes, taken from
the results of the BEAST analysis. The scale is given as the number of
substitutions per variable site; asterisks indicate that no data were available.

Isolates spanning A18 to PRLS (the lower clade in Fig. 1) represent
wave 1, covering about 16 years (1977-1992). All isolates in this group
lack the integrative and conjugative element (ICE) of the SXT/R391
family, encoding resistance to several antibiotics’. It is within this
time period that seventh-pandemic cholera occurred in South
America®. Our data show that the South American isolates form a
discrete cluster, which also includes a single Angolan isolate collected
in 1989. The position of the Angolan isolate at the base of the South
American group indicates that transmission to South America may
have been via Africa, as previously proposed’*. We used BEAST to
translate evolutionary distance in SNPs into time (Supplementary
Fig. 5) and this indicated that transmission to South America is likely
to have occurred between 1981 and 1985. The branch harbouring this 
West African-South American (WASA) clade is distinguished from all 
other V. cholerae by the acquisition of novel VSP-2 genes’* and a novel 
genomic island that we have denoted WASA1 (Supplementary Table 3). 
Notably, the Angolan isolate A5 and all the South American isolates are 
discriminated by just ten SNPs. Based on the accumulation rate of 
3.3 SNPs year ' (Supplementary Fig. 4), the 3-year time period between 
the isolation of A5 and the oldest South American isolate included in 
this study, A32, is consistent with previous studies indicating that 
cholera spread as a single epidemic”’. 
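
The consistency argument above amounts to a one-line calculation, sketched here in Python. The helper `expected_snps` is a hypothetical name, and the sketch assumes SNPs accumulate linearly along a single line of descent at the reported core-genome rate; it makes no attempt to model the separate branches leading to each isolate.

```python
SNPS_PER_YEAR = 3.3  # core-genome accumulation rate reported in the text


def expected_snps(years_elapsed, rate=SNPS_PER_YEAR):
    """Expected number of SNPs accumulated along one lineage over a time span."""
    return rate * years_elapsed


# Three years separate the isolation of A5 (Angola) and A32 (South America).
expected = expected_snps(3)
```

Under these assumptions the expectation is about ten SNPs, which matches the ten SNPs that discriminate the Angolan isolate from the South American isolates.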

The first acquisition of an SXT/R391 ICE lies at the point of transi- 
tion from the wave-1 cluster to the wave-2 cluster. Using our dated 
phylogeny (Supplementary Fig. 5)’*, we were able to date this transi- 
tion and the first acquisition of SXT/R391 ICE to 1978-84, ten years 
before its discovery in O139 strains, which also fits with the otherwise 
surprising discovery of SXT in a Vietnamese strain isolated before 
1992 (ref. 16). This date would also correspond to the most recent 
common ancestor (MRCA) of the O1 and O139 serogroup isolates.
Analysis of the diversity of the common regions of SXT/R391 ICEs in
our seventh-pandemic collection (Supplementary Fig. 6) shows that
they are discriminated by 3,161 SNPs, compared to only 1,757 SNPs 
used to define the core whole-genome phylogeny in Fig. 1. This indi- 
cates either that there have been several recombination events within 
these ICEs, or that they have been acquired independently several 
times on the tree’’. Isolates from wave 2 represent a discrete cluster 
that shows a complex pattern of accessory elements in the CTX locus 
(Fig. 1) and a wide phylogeographical distribution. It is also notable 
that isolates collected in Vietnam in 1995-2004 and strain A109 are the 
only wave-2 isolates studied from this time period that lack an SXT/ 
R391 ICE. We examined the genomic locus in these clones that marks 
the point of insertion of SXT/R391 ICE in all other V. cholerae isolates 
and found no remnants of this conjugative element, which may have 
been lost from this lineage (no ‘scar’ in DNA sequence is expected after 
the precise excision of SXT/R391 ICE). 

Ignoring the CTX-related genomic regions, the seventh-pandemic 
L2 isolates show relatively little evidence of recombination either
within or from outside the tree. On the basis of the SNP distribution, 
1,930 out of 2,027 SNPs (Supplementary Table 4) are congruent with 
the tree, leaving 97 homoplasies that could be due to selection or 
homologous recombination among the L2 isolates. Only 270 SNPs 
were predicted to be due to homologous recombination from outside 
the tree. The only two branches in which the SNP distribution indi- 
cated considerable recombination were those leading to the WASA 
cluster (Supplementary Fig. 7) and the O139 serogroup. Aside from 
the acquisitions of CTX and the SXT/R391 ICEs, we found evidence of 
gene flux affecting only 155 other genes (Supplementary Figs 8 and 9 
and Supplementary Table 3). 
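
The SNP counts in this paragraph can be cross-checked against the figure of 1,757 SNPs quoted elsewhere in the paper for the final phylogeny. The short Python sketch below only restates numbers given in the text; the variable names are illustrative.

```python
TOTAL_CORE_SNPS = 2027       # SNPs detected in the core genome of the L2 isolates
CONGRUENT_WITH_TREE = 1930   # SNPs congruent with the tree
EXTERNAL_RECOMBINANT = 270   # SNPs attributed to recombination from outside the tree

# SNPs that conflict with the tree (selection or recombination within L2).
homoplasies = TOTAL_CORE_SNPS - CONGRUENT_WITH_TREE

# SNPs left for phylogeny building once externally recombined sites are removed.
phylogeny_snps = TOTAL_CORE_SNPS - EXTERNAL_RECOMBINANT
```

The two differences reproduce the 97 homoplasies and the 1,757 SNPs used to define the core whole-genome phylogeny.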

Also represented in our collection are two isolates of serogroup 
O139, which are known to have arisen from a homologous replacement
of their O-antigen determinant into an El Tor genomic backbone**”*.
CTX types that are different from El Tor, classical, CTX-2
and CTX-3 have been reported for the O139 serogroup’’ *°; however,
the phylogenetic position of the two strains included in this study
shows that O139 was derived from O1 El Tor and therefore represents
another distinct but spatially restricted wave from the common source. 

We were also able to date the ancestor of the El Tor seventh- 
pandemic lineage, L2, as having existed in 1827-1936 (Supplemen- 
tary Fig. 5), which is consistent with the predicted date of origin from 
the linear regression plot (1910, Supplementary Fig. 4). This also
corresponds well with the date of isolation of the first El Tor biotype
strain in 1905 (ref. 21). 

It is apparent from Fig. 1 that V. cholerae wave 1, which spread 
globally, was later replaced by the more geographically restricted wave 
2 and wave 3, a phenomenon supported by local clinical observations 
and phage analysis’®. This also reflects the fact that V. cholerae epi- 
demics since 2003-2010 have been restricted to Africa and south Asia. 
Notably, the rates of SNP accumulation calculated independently for 
wave 1, wave 3 and wave 2 (2.3, 2.6 and 3.5 SNPs year⁻¹ respectively)
are consistent with the rate calculated over the whole collection period 
(Supplementary Fig. 4). 

The clonal clustering of L2 isolates, the constant rate of SNP accu- 
mulation and the temporal and geographical distribution support the 
concept that the seventh pandemic has spread by periodic radiation from 
a single source population located in the Bay of Bengal, followed by local 
evolution and ultimately local extinction in non-endemic areas. This is 
evidenced by the disappearance of wave-1 isolates, followed by the inde- 
pendent expansion of waves 2 and 3, both derived from the same original 
population, occurring within seven years of each other. These two waves 
are clearly distinguished from the first by the acquisition of SXT/R391 
ICEs (Fig. 1). Plotting the intercontinental spread of each wave onto the 
world map (Fig. 2) clearly shows that the V. cholerae seventh pandemic is 
sourced from a single, restricted geographical location but has spread in 
overlapping waves. In these ancestral waves, there are at least four recent 
long-range transmission events (A—D in Fig. 1), in which isolates clearly 
share a common ancestor with recent strains at distant locations, indi- 
cating that such events are not uncommon. The most recent example of 
this is the Haitian outbreak, in which strains share a very recent common 
ancestor with south-Asian strains at the tip of wave 3. The number of 
SNP differences, even at whole-genome resolution, between the Haitian 
and the most closely related Indian and Bangladeshi strains is very low. 
This demonstrates that the Haitian strains must have come from south 
Asia, at most within the last six years. However, the limited discrimina- 
tion means that it may prove challenging to make country-specific infer- 
ences as to the origins of the Haitian strains on the basis of DNA 
sequence alone. For such conclusions to be robust, great care must be 
taken in the selection of samples for analysis. 

Figure 2 | Transmission events inferred for the seventh-pandemic
phylogenetic tree, drawn on a global map. The date ranges shown for
transmission events are taken from the BEAST analysis, and represent the
median values for the MRCA of the transmitted strains (later bound), and the
MRCA of the transmitted strains and their closest relative from the source
location (earlier bound).

Despite clear evidence of sporadic long-range transmission events that
are likely to be associated with direct human carriage, the overall pattern
seen in our data is one of continued local evolution of V. cholerae in the
Bay of Bengal, with several independent waves of global transmission
resulting in short-term epidemics in non-endemic countries. Although 
our sample set is substantial, there are clearly areas where geographical 
coverage is limited. However, the structure of the tree, with deep 
branches between the major waves, means that increasing the number 
of strains and the resolution further should only identify further inde- 
pendent waves of transmission. Indeed, we cannot rule out the possibility 
of an El Tor population persisting or evolving as a new wave of the 
seventh pandemic; for example, in areas such as China that were not 
sampled in this study. 

One notable factor in the ongoing evolution of pandemic cholera 
was the acquisition of the SXT/R391-family antibiotic resistance ele- 
ment. The clinical use of the antibiotics tetracycline and furazolidone 
for cholera treatment started in 1963 and 1968 respectively, about 
15 years before our prediction of the first acquisition of an SXT/ 
R391 ICE (1978-1984). Our analysis provides a robust framework 
for elucidating the evolution of the seventh pandemic further, and 
for studying the local evolution, particularly in the Bay of Bengal, that 
has such a key role in the evolution of cholera. 


METHODS SUMMARY 


Genomic libraries were created for each sample, followed by multiplex sequencing 
on an Illumina GAIIx analyser. The 54-base paired-end reads obtained were 
mapped against N16961 El Tor as a reference and SNPs in the core genome were 
identified as described in Methods. The SNPs were used to draw a whole core- 
genome phylogeny as described in ref. 4. The final SNP alignment was used to 
perform BEAST" analysis and to confirm the output of linear regression analysis. 
The three cholera waves reported in the seventh-pandemic phylogeny were con- 
firmed using BAPS*°. The raw Illumina data were also assembled de novo (see 
Methods) so that pairwise genome comparisons could be made. A new and 
expandable nomenclature system describing the CTX trends seen in the last 
40 years was proposed following the rationale described in Methods. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.
METHODS


Genomic library creation and multiplex sequencing. Unique index-tagged 
libraries for each sample were created, and up to 12 separate libraries were 
sequenced in each of eight channels in Illumina Genome Analyser GAII cells with 
54-base paired-end reads. The index-tag sequence information was used for 
downstream processing to assign reads to the individual samples’. 
Detection of SNPs in the core genome. The 54-base paired-end reads were 
mapped against the N16961 El Tor reference (accession numbers AE003852 
and AE003853) and SNPs were identified as described in ref. 7. The unmapped 
reads and the sequences that were not present in all genomes were not considered a 
part of the core genome, and therefore SNPs from these regions were not included 
in the analysis. Appropriate SNP cutoffs were chosen to minimize the number of 
false-positive and false-negative calls; SNPs were filtered to remove those at sites 
with a SNP quality score lower than 30, and SNPs at sites with heterogeneous 
mappings were filtered out if the SNP was present in fewer than 75% of reads at 
that site. From the seventh-pandemic data set, high-density SNP clusters indi- 
cating possible recombination were excluded’. In total, 2,027 SNPs were detected 
in the core genome of the El Tor lineage. Of these, 270 SNPs were predicted to be 
due to recombination. Removing these provided a data set characterized by 
1,757 SNPs: these were used to produce the final phylogeny. 
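
The two filtering thresholds stated above can be expressed as a small predicate. This is a minimal sketch, not the pipeline used in the study: the `SnpCall` record, its field names and the example calls are all hypothetical, and the sketch applies the read-fraction test to every site rather than only to sites with heterogeneous mappings.

```python
from dataclasses import dataclass

MIN_QUALITY = 30          # SNPs with a quality score lower than this are removed
MIN_READ_FRACTION = 0.75  # the SNP must be present in at least 75% of reads


@dataclass
class SnpCall:
    position: int
    quality: float
    supporting_reads: int  # reads carrying the variant base
    total_reads: int       # all reads mapped at the site


def passes_filters(call: SnpCall) -> bool:
    """Apply the two thresholds stated in the Methods to one candidate SNP."""
    if call.quality < MIN_QUALITY:
        return False
    if call.total_reads == 0:
        return False
    return call.supporting_reads / call.total_reads >= MIN_READ_FRACTION


# Hypothetical candidate calls.
calls = [
    SnpCall(position=101, quality=45, supporting_reads=20, total_reads=20),
    SnpCall(position=202, quality=25, supporting_reads=20, total_reads=20),  # low quality
    SnpCall(position=303, quality=60, supporting_reads=12, total_reads=20),  # 60% of reads
]
kept = [c.position for c in calls if passes_filters(c)]
```

Only the first call survives: the second fails the quality threshold and the third is supported by too small a fraction of reads.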
Comparative genomics. Raw Illumina data were split to generate paired-end 
reads, and assembled using a de novo genome-assembly program, Velvet v0.7.03 
(ref. 22), to generate a multi-contig draft genome for each of 133 V. cholerae 
strains’. The overlap parameters were optimized to give the highest N50 value. 
Because seventh-pandemic V. cholerae strains are closely related in the core, 
Abacas”’ was used to order the contigs using the N16961 El Tor strain as a 
reference, followed by annotation transfer from the reference strain to each draft 
genome’. Using the N16961 sequence as a database to perform a TBLASTX™ for 
each draft genome, a genome comparison file was generated that was subsequently 
used in the Artemis comparison tool’* to compare the genomes manually and 
search for novel genomic islands. 
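
For readers unfamiliar with the N50 statistic used to tune the assembly, the following Python sketch shows one common way to compute it. The function and the contig lengths are illustrative and are not taken from Velvet or from the study's data.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical contig lengths (bp) from two assemblies of the same reads.
assembly_a = [500, 400, 300, 200, 100]
assembly_b = [900, 300, 150, 100, 50]

# Choosing the parameter set with the highest N50 favours the more contiguous assembly.
best = max((assembly_a, assembly_b), key=n50)
```

Both hypothetical assemblies total 1,500 bp, but the second reaches half of that with a single 900 bp contig, so it has the higher N50 and would be preferred.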
Phylogenetic analysis. A phylogeny was drawn for V. cholerae using RAxML 
v0.7.4 (ref. 26) to estimate the trees for all SNPs called from the core genome. 
The general time-reversible model with gamma correction was used for among- 
site rate variation for ten initial trees’. USA gulf coast strains A215 and A325, 
which have substantially different core genomes from all other strains in our 
collection, were used as an outgroup to root the global phylogeny (Supplemen- 
tary Fig. 1), whereas a pre-seventh-pandemic strain, M66 (accession numbers 
CP001233 and CP001234), and strain A6 (from our collection), were used to root 
the seventh-pandemic phylogenetic tree (Fig. 1). 
CTX prophage analysis. For each strain, the CTX structure and the sequence of 
rstA, rstR and ctxB was determined as in refs 27 and 28. 
Linear regression and Bayesian analysis. The phylogram for the seventh 
pandemic was exported to Path-O-Gen v1.3 (http://tree.bio.ed.ac.uk/software/ 
pathogen) and a linear regression plot for isolation date versus root-to-tip distance 
was generated. The same plot was also constructed individually for the three 
waves, but A4, being a laboratory strain, was excluded from the latter analysis. 
The presence of three waves was checked, and their makeup was determined, 
using a BAPS analysis performed on the SNP alignment containing the unique 
SNP patterns from the seventh-pandemic isolates. The program was run using the 
BAPS individual mixture model and three independent iterations were performed 
using an upper limit for the number of populations of 20, 21 and 22 to obtain
optimal partitioning of the sample. The dates for the acquisition of SXT and the
ancestors of the three waves were inferred using the Bayesian Markov chain Monte 
Carlo framework BEAST”. We used the final SNP alignment with recombinant 
sites removed and fixed the tree topology to the phylogeny produced by RAxML, 
as described above. We used BEAST to estimate the rates of evolution on the 
branches of the tree using a relaxed molecular clock"*, which allows rates of 
evolution to vary amongst the branches of the tree. BEAST produced estimates 
for the dates of branching events on the tree by sampling dates of divergence 
between isolates from their joint posterior distribution, in which the sequences 
are constrained by their known date of isolation. The data were analysed using a 
coalescent constant population size and a general time-reversible model with 
gamma correction. The results were produced from three independent chains of 
50 million steps each, sampled every 10,000 steps to ensure good mixing. The first 
5 million steps of each chain were discarded as a burn-in. The results were com- 
bined using Log Combiner, and the maximum clade credibility tree was generated 
using Tree Annotator, both parts of the BEAST package (http://tree.bio.ed.ac.uk/ 
software/beast/). Convergence and the effective sample-size values were checked 
using Tracer 1.5 (available from http://tree.bio.ed.ac.uk/software/tracer). ESS 
values in excess of 200 were obtained for all parameters. 
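
The chain settings above imply a fixed number of retained posterior samples, worked through in the Python sketch below. It assumes that one state is logged at every 10,000th step and that burn-in is removed as a step count; whether the initial state is also logged would change the totals by one per chain.

```python
CHAIN_LENGTH = 50_000_000  # steps per chain
SAMPLE_EVERY = 10_000      # sampling interval in steps
BURN_IN = 5_000_000        # steps discarded from the start of each chain
N_CHAINS = 3               # independent chains that were combined

# Samples kept from one chain after discarding the burn-in.
retained_per_chain = (CHAIN_LENGTH - BURN_IN) // SAMPLE_EVERY

# Samples available once the three chains are combined.
retained_total = retained_per_chain * N_CHAINS
```

Under these assumptions each chain contributes 4,500 samples, giving 13,500 in the combined set.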

Nomenclature. The seventh-pandemic cholera strains were clearly distinguished 
by three waves and we therefore propose their CTX types to be CTX-1, CTX-2 and 
CTX-3 under the new nomenclature scheme (see Supplementary Table 2). Our 
nomenclature system is expandable and would be suitable for naming any new 
seventh-pandemic V. cholerae strains. With CTX-1 representing canonical El Tor, 
we followed the rationale: (1) for CTX-1 to CTX-2, because there was a shift of
rstR^ElTor to rstR^Classical, rstA^ElTor to rstA^Classical+ElTor and ctxB^ElTor to ctxB^Classical,
we called it CTX-2; (2) for CTX-1 to CTX-3, because there was a shift of ctxB^ElTor
to ctxB^Classical, we called it CTX-3; (3) for CTX-3 to CTX-3b, because there was
only one SNP mutation in ctxB^Classical from CTX-2 and the rest was identical, we called
it the next variant of CTX-3, which is CTX-3b.

In summary, if there is a shift of any gene from one biotype to another, the new 
CTX will be called CTX-n: thus the next strains fitting these criteria will be called 
CTX-4. However, if there is a mutation(s) that does not lead to a shift of the gene to 
another biotype gene, CTX-1b, CTX-1c or CTX-2b; CTX-2c or CTX-3b; CTX-3c 
and so on should be followed as appropriate. 
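
One possible reading of this naming rule is sketched below in Python. The function `next_ctx_name` and its arguments are hypothetical and are not part of the published scheme. The sketch assumes that names follow the pattern CTX-n with an optional single lower-case letter, that the parent type is already in the list of existing names, and that letter variants start at "b".

```python
import re


def next_ctx_name(existing, parent, biotype_shift):
    """Suggest a name for a new CTX type derived from `parent`.

    A shift of any gene to another biotype yields the next integer (CTX-n);
    a mutation without such a shift yields the next letter variant of the parent.
    """
    numbers = [int(re.match(r"CTX-(\d+)", name).group(1)) for name in existing]
    if biotype_shift:
        return f"CTX-{max(numbers) + 1}"
    base = re.match(r"CTX-\d+", parent).group(0)
    pattern = re.escape(base) + r"[a-z]?"
    suffixes = [name[len(base):] for name in existing if re.fullmatch(pattern, name)]
    last = max(suffixes)  # the empty suffix sorts before any letter
    return base + ("b" if last == "" else chr(ord(last) + 1))


known = ["CTX-1", "CTX-2", "CTX-3", "CTX-3b"]
shifted = next_ctx_name(known, "CTX-3", biotype_shift=True)
variant = next_ctx_name(known, "CTX-3", biotype_shift=False)
first_variant = next_ctx_name(known, "CTX-1", biotype_shift=False)
```

With the four types named in the text, a new biotype shift would be called CTX-4, a further point-mutation variant of CTX-3 would be CTX-3c, and the first such variant of CTX-1 would be CTX-1b.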
Broadly neutralizing antibodies against highly variable viral
pathogens are much sought after to treat or protect against global 
circulating viruses. Here we probed the neutralizing antibody 
repertoires of four human immunodeficiency virus (HIV)-infected 
donors with remarkably broad and potent neutralizing responses 
and rescued 17 new monoclonal antibodies that neutralize broadly 
across clades. Many of the new monoclonal antibodies are almost 
tenfold more potent than the recently described PG9, PG16 and 
VRC01 broadly neutralizing monoclonal antibodies and 100-fold 
more potent than the original prototype HIV broadly neutralizing 
monoclonal antibodies. The monoclonal antibodies largely 
recapitulate the neutralization breadth found in the corresponding 
donor serum and many recognize novel epitopes on envelope (Env) 
glycoprotein gp120, illuminating new targets for vaccine design. 
Analysis of neutralization by the full complement of anti-HIV 
broadly neutralizing monoclonal antibodies now available reveals 
that certain combinations of antibodies should offer markedly 
more favourable coverage of the enormous diversity of global 
circulating viruses than others and these combinations might be 
sought in active or passive immunization regimes. Overall, the 
isolation of multiple HIV broadly neutralizing monoclonal 
antibodies from several donors that, in aggregate, provide broad 
coverage at low concentrations is a highly positive indicator for the 
eventual design of an effective antibody-based HIV vaccine. 

Most successful antiviral vaccines elicit neutralizing antibodies as a 
correlate of protection. For highly variable viruses—such as HIV, 
hepatitis C virus (HCV) and, to a lesser extent, influenza—vaccine 
design efforts have been hampered by the difficulties associated with 
eliciting neutralizing antibodies that are effective against the enormous 
diversity of global circulating isolates (that is, broadly neutralizing 
antibodies). However, for HIV for example, 10-30% of infected 
individuals do, in fact, develop broadly neutralizing sera and protective 
broadly neutralizing monoclonal antibodies have been isolated from 
infected donors. It has been suggested that, given the appropriate 
immunogen, it should be possible to elicit these types of responses by 
vaccination, and understanding the properties of broadly neutralizing 
monoclonal antibodies has become a major goal in research on highly 
variable viruses. 

We have previously screened sera from approximately 1,800 HIV- 
infected donors for neutralization breadth and potency, designating 
the top 1% as 'elite neutralizers', based on a score incorporating both 
breadth and potency. In this study, we set out to isolate broadly 
neutralizing monoclonal antibodies from the top four elite neutralizers 
(Supplementary Table 1) by screening antibody-containing memory B 
cell supernatants for broad neutralizing activity using a recently 
described high-throughput functional approach. Antibody variable 
genes were rescued from B-cell cultures that showed cross-clade neut- 
ralizing activity and expressed as full-length IgGs. Analysis of the 
sequences revealed that all of the monoclonal antibodies isolated from 
each individual donor belong to a distant, but clonally related cluster of 
antibodies (Supplementary Table 2). Because it has been proposed that 
antibodies from HIV-infected patients are often polyreactive, we 
tested the new monoclonal antibodies for binding to a panel of antigens 
and showed that they were not polyreactive (Supplementary Fig. 2). 
The potency and breadth of the monoclonal antibodies were next 
assessed on a 162-pseudovirus panel representing all major circulating 
HIV subtypes (Fig. 1 and Supplementary Tables 3 and 4). All of the 
monoclonal antibodies exhibited cross-clade neutralizing activity, but 
more strikingly, several showed exceptional potency. The median 
antibody concentrations required to inhibit HIV activity by 50% or 
90% (IC50 and IC90 values) of PGT monoclonal antibodies 121-123 
and 125-128 were almost tenfold lower (that is, more potent) than the 
recently described PG9, PG16, VRC01 and PGV04 broadly neutral- 
izing monoclonal antibodies (E. Falkowska et al., manuscript in 
preparation, X. Wu et al., Science, in the press), and approximately 
100-fold lower than other broadly neutralizing monoclonal antibodies 

Figure 1 | Neutralization activity of the newly identified PGT antibodies. 
a, Median neutralization potency against viruses neutralized with an 
IC50 < 50 µg ml⁻¹. b, Neutralization breadth at different IC50 cut-offs. 

described earlier (Fig. 1). At concentrations less than 0.1 µg ml⁻¹, these 
monoclonal antibodies still neutralized 27% to 50% of viruses in the 
panel (Fig. 1). Although PGT monoclonal antibodies 135, 136 and 137 
showed a lesser neutralization breadth than the other monoclonal 
antibodies, they all still potently neutralized over 30% of the clade C 
viruses on the panel (Supplementary Fig. 2 and Supplementary 
Table 3b). This result is significant considering that HIV clade C pre- 
dominates in sub-Saharan Africa and accounts for more than 50% of 
all HIV infections worldwide. 

Interestingly, many of the clonally related monoclonal antibodies 
exhibited differing degrees of overall neutralization potency. For 
example, the median IC50 values of PGT monoclonal antibodies 131, 
136, 137 and 144 were approximately 10- to 50-fold higher than those 
of their somatically related sister clones (Fig. 1). Also, in some cases, 
the somatically related monoclonal antibodies exhibited similar neut- 
ralization potency but differing degrees of neutralization breadth 
against the panel of viruses tested (Fig. 1 and Supplementary Tables 
3 and 4). For example, PGT 128 neutralized with comparable overall 
potency but significantly greater neutralization breadth than the clon- 
ally related PGT 125, 126 and 127 monoclonal antibodies (Fig. 1 and 
Supplementary Tables 3 and 4). Overall, these observations suggest 
that serum neutralization breadth may develop from the successive 
selection of somatic variants that bind to a modified epitope or a 
slightly different Env conformation expressed on virus escape variants. 
Comparison of the neutralization profiles of the monoclonal antibodies 
isolated from a given donor with that from the corresponding serum 
revealed that the isolated monoclonal antibodies could largely recapi- 
tulate the serum neutralization breadth and potency (Fig. 2 and Sup- 
plementary Fig. 3). 

We next sought to gain information on the epitopes recognized by the 
newly isolated broadly neutralizing monoclonal antibodies. Enzyme-linked 
immunosorbent assays (ELISAs) indicated that PGT monoclonal 
antibodies 121-123, 125-128, 130, 131 and 135-137 bound to mono- 
meric gp120 (Supplementary Table 5). In contrast, the PGT 141-145 
broadly neutralizing monoclonal antibodies exhibited a strong pref- 
erence for membrane-bound, trimeric HIV Env (Supplementary 
Fig. 4). On the basis of this result, we postulated that these broadly 
neutralizing monoclonal antibodies bound to quaternary epitopes 
similar to those of the recently described PG9 and PG16 broadly neutralizing 
monoclonal antibodies. Indeed, this hypothesis was confirmed 
by competition studies, N160K sensitivity and, for PGT monoclonal 
antibodies 141-144, an inability to neutralize JR-CSF pseudoviruses 
expressing homogeneous Man9GlcNAc2 glycans (Supplementary 
Fig. 5). 

Figure 2 | Key monoclonal antibodies fully recapitulate serum 
neutralization by the corresponding donor serum. Serum breadth was 
correlated with the breadth of the broadest monoclonal antibody (mAb) for 
each donor (percentage of viruses neutralized at 50% neutralizing titre 
(NT50) > 100 or IC50 < 50 µg ml⁻¹, respectively). Of note, monoclonal 
antibodies isolated from donor 39 could not completely recapitulate the serum 
neutralization breadth. 

To define the epitopes recognized by the remaining PGT antibodies, 
competition ELISA assays were carried out with a panel of well- 
characterized neutralizing and non-neutralizing antibodies (Fig. 3a). 
Unexpectedly, all of the remaining antibodies (PGT monoclonal 
antibodies 121-123, 125-128, 130, 131 and 135-137) competed with 
the glycan-specific broadly neutralizing monoclonal antibody 2G12. 
This result was surprising given that 2G12 had previously formed its 
own unique competition group. All of the monoclonal antibodies, 
except for PGT monoclonal antibodies 135, 136 and 137, also 
competed with a V3-loop-specific monoclonal antibody and failed to 
bind to gp120ΔV3, suggesting that their epitopes were in proximity to 
or contiguous with V3 (Fig. 3a and Supplementary Table 5). 
Deglycosylation of gp120 with Endo H abolished binding by all the 
monoclonal antibodies, indicating that certain oligomannose glycans 
were important for epitope recognition (Supplementary Table 5). 
Competition of these monoclonal antibodies with 2G12 and lack of 
binding to deglycosylated gp120 prompted us to investigate whether 
these antibodies contacted glycans directly. Glycan array analysis 
revealed that PGT monoclonal antibodies 125-128 and 130 bound 
specifically to both Man8GlcNAc2 and Man9GlcNAc2, whereas the 
remaining antibodies showed no detectable binding to high-mannose 
glycans (Fig. 3b). Interestingly, binding of PGT monoclonal antibodies 
125-128 and 130 to gp120 was competed by Man9 but, unlike 2G12, 
was not competed by monomeric mannose or Man4 (D1 arm of 
Man9GlcNAc2) (Fig. 3c, d), suggesting a different mode of glycan 
recognition. Furthermore, in contrast to 2G12, no evidence was found for 
domain exchange and monomeric Fab fragments still exhibited potent 
neutralizing activity (Supplementary Fig. 7 and data not shown). 

To define further the epitopes recognized by the monoclonal 
antibodies, neutralizing activity against a large panel of HIV-1JR-CSF 
variants incorporating single alanine substitutions was assessed using a 
single round of replication pseudovirus assay (Supplementary Table 6). 
In the panel of mutants, the N-linked glycans at positions 332 and/or 
301 were important for neutralization by PGT monoclonal antibodies 
125-128, 130 and 131, suggesting their direct involvement in epitope 
formation. The apparent dependency on so few glycans indicates that, 
although these PGT monoclonal antibodies contact Man8/9GlcNAc2 
glycans directly, their arrangement in the context of gp120 is critical for 
high-affinity glycan recognition and neutralization potency. This is 
further highlighted by the inability of the PGT monoclonal antibodies 
to neutralize simian immunodeficiency virus (SIV) strain SIVmac239, 
HIV-2 or HCV, which show a high level of glycosylation (data not 
shown). Interestingly, although PGT monoclonal antibodies 121-123 
failed to exhibit detectable binding to high-mannose glycans and were not 
competed by mannose sugars (Supplementary Fig. 6), the only 
substitutions that completely abolished neutralization by these monoclonal 
antibodies were those that resulted in removal of the glycan at 
position 332. Although structural studies will be required to fully define 
the epitopes recognized by these antibodies, the above results indicate 
either that the PGT monoclonal antibodies 121-123 bind to a protein 
epitope along the gp120 polypeptide backbone that is conformationally 
dependent on the N332 glycan or that the glycan contributes more 
strongly to binding in the context of the intact protein. 

Vaccines against pathogens with low antigenic diversity, such as 
hepatitis B virus or measles, commonly achieve 90-95% efficacy. 
Similarly, the influenza vaccine achieves 85-90% efficacy in years 


Figure 3 | Epitope mapping of PGT antibodies. a, Competition of PGT 
monoclonal antibodies with sCD4 (soluble CD4), b12 (anti-CD4 binding site), 
2G12 (anti-glycan), F425/b4e8 (anti-V3), X5 (CD4-induced), PG9 (anti-V1/V2 
and V3, quaternary) and each other. Competition assays were performed by 
ELISA using gp120 Bal or gp120 JR-FL, except for the PG9 competition assay, 
which was performed on the surface of JR-FL E168K or JR-CSF transfected cells. 
Boxes are coded as follows: +++, 75-100% competition; ++, 50-75% 
competition; +, 25-50% competition; −, <25% competition. Experiments 
were performed in duplicate, and data represent an average of at least two 
independent experiments. b, Glycan microarray analysis (Consortium for 
Functional Glycomics (CFG), version 5.0) reveals that PGT monoclonal 
antibodies 125, 126, 127, 128 and 130 contact Man8 (313), Man8GlcNAc2 (193), 
Man9 (314) and Man9GlcNAc2 (194) glycans directly. Only glycan structures 
with RFU (relative fluorescent units) > 3,000 are shown. PGT-131 showed no 
detectable binding to the CFG glycan array but bound to Man9-oligodendrons 
(data not shown). Error bars represent standard deviation. c, d, Binding of PGT 
monoclonal antibodies 125, 126, 127, 128 and 130 to gp120 is competed by 
Man9 oligodendrons but not Man4 oligodendrons. Binding of 131 to 
immobilized gp120 was too low to measure any competition. Error bars 
represent standard error of the mean. 

when the vaccine and circulating seasonal strain are well matched. 
However, efficacy drops severely in years when there is a mismatch 
between the vaccine and circulating strain. In the case of HIV, the 
global diversity of circulating viruses is such that the match between 
the prophylactic antibodies and the circulating viruses—that is, the 
antibody viral coverage—will be crucial for the degree of efficacy of 
active or passive prophylaxis approaches. As yet, although the recent 

Figure 4 | Certain antibodies or antibody combinations are able to cover a 
broad range of HIV isolates at low, vaccine-achievable concentrations. 
a, Cumulative frequency distribution of IC50 values of broadly neutralizing 
monoclonal antibodies tested against a 162-virus panel. The y-axis shows the 
cumulative frequency of IC50 values up to the concentration shown on the 
x-axis and can therefore also be interpreted as the breadth at a specific IC50 
cut-off. b, c, Percentage of viruses covered by single monoclonal antibodies (solid 
lines) or by at least one of the monoclonal antibodies in dual combinations of 
breadth (dashed black lines) dependent on individual concentrations. The grey 
area in both panels is the coverage of 26 monoclonal antibodies tested on the 
162-virus panel (PGT121-123, PGT125-128, PGT130-131, PGT135-137, 
PGT141-145, PG9, PG16, PGC14, VRC01, PGV04, b12, 2G12, 4E10, 2F5) and 
depicts the theoretical maximal achievable coverage known to date. 

RV 144 trial has led to speculation that some degree of protection 
against HIV may be achieved through extra-neutralizing activities of 
antibodies, such as antibody-dependent cell-mediated cytotoxicity or 
phagocytosis, the strongest evidence for protection is for neutralizing 
antibodies in non-human primate models using simian-human 
immunodeficiency virus (SHIV) challenge. Passive administration 
of neutralizing antibodies in these models suggests that a serum antibody 
concentration of approximately 100 times the in vitro 
pseudovirus assay IC50, or greater, is required to achieve a meaningful level of 
protection. Therefore, if a vaccine elicits a serum broadly neutralizing 
antibody concentration on the order of 10 µg ml⁻¹ (ref. 26) and if an 
IC50:protective-serum-concentration ratio of 1:100 is assumed, then 
protection would be achieved only against viruses for which the broadly 
neutralizing antibody IC50 is lower than 0.1 µg ml⁻¹. As a second, more 
conservative scenario, for an IC50:protective-serum-concentration ratio 
of 1:500, protection would be achieved against viruses for which the 
broadly neutralizing antibody IC50 is lower than 0.02 µg ml⁻¹. As shown 
in Fig. 4, although various broadly neutralizing monoclonal antibodies 
show breadth at high concentrations, viral coverage often drops sharply 
at lower concentrations. Therefore, if elicited or delivered singly, only the 
most potent antibodies, such as 121 and 128, would be able to achieve a 
meaningful level of viral coverage, in particular at concentrations cor- 
responding to the more conservative scenario given above. As broadly 
neutralizing monoclonal antibodies show different and in some cases 
complementary breadth, we further looked at the theoretical coverage 
achieved by antibody combinations. For the two IC50:protective-serum- 
concentration ratios above, a combination of PGV04 and VRC01, the 
two most potent CD4 binding site broadly neutralizing monoclonal 
antibodies, would provide protection against 29% and 2% of viruses, 
respectively (Fig. 4b). In contrast, for a vaccine eliciting antibodies with 
high potency and favourable non-overlapping breadth, such as 128 and 
145, coverage would be achieved against 63% and 40% of viruses for the 
two scenarios (Fig. 4c). Several combinations of two broadly neutralizing 
monoclonal antibodies, including those directed to overlapping epi- 
topes, can yield this degree of coverage (Supplementary Fig. 8). In addi- 
tion, a combination of all of the broadly neutralizing monoclonal 
antibodies would cover 89% and 62% of viruses, correspondingly. 
Coverage against such a large proportion of viruses would probably have 
an important impact on the pandemic. 
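
The two protection scenarios above reduce to a simple division: the highest IC50 that is still covered equals the serum antibody concentration divided by the required fold excess. The sketch below is illustrative only; the function names and the example IC50 list are hypothetical and are not data from this study.

```python
import math


def max_covered_ic50(serum_conc_ug_ml, fold_required):
    """Highest IC50 (µg/ml) still covered at a given serum concentration."""
    return serum_conc_ug_ml / fold_required


def coverage(ic50_values, threshold):
    """Fraction of viruses whose IC50 lies below the threshold."""
    return sum(value < threshold for value in ic50_values) / len(ic50_values)


SERUM_CONC = 10.0  # µg/ml, the vaccine-elicited concentration assumed in the text

# 1:100 and 1:500 are the two IC50:protective-serum-concentration ratios.
thresholds = {fold: max_covered_ic50(SERUM_CONC, fold) for fold in (100, 500)}
assert math.isclose(thresholds[100], 0.1)
assert math.isclose(thresholds[500], 0.02)

# Hypothetical IC50 values (µg/ml) for one antibody against six viruses.
example_ic50 = [0.005, 0.015, 0.05, 0.3, 2.0, 50.0]
for fold, threshold in thresholds.items():
    print(f"1:{fold} -> IC50 < {threshold} µg/ml, "
          f"coverage {coverage(example_ic50, threshold):.0%}")
```

The made-up values show the same qualitative effect described for Fig. 4: coverage falls when the stricter 1:500 ratio lowers the IC50 cut-off.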

An effective vaccine against HIV will probably require the elicitation 
of a combination of complementary potent neutralizing antibodies. 
The demonstration that large numbers of potent and diverse broadly 
neutralizing monoclonal antibodies can be isolated from several dif- 
ferent individuals provides grounds for renewed optimism that an 
antibody-based vaccine may be achievable. 


METHODS SUMMARY 


Activated memory B-cell supernatants were screened in a high-throughput format 
for neutralization activity using a micro-neutralization assay, as described. Heavy- 
and light-chain variable regions were isolated from B-cell lysates of selected 
neutralizing hits by reverse transcription from RNA followed by multiplex PCR 
amplification using family-specific V-gene primer sets. For some antibodies, 
traditional cloning methods were used for antibody isolation, as described. For 
other antibodies, amplicons from each lysate were uniquely tagged with multiplex 
identifier (MID) sequences and 454 sequencing regions (Roche). Single-round-of-replication 
pseudovirus neutralization assays and cell surface binding assays were 
performed as described previously. Glycan reactivities were profiled on a 
printed glycan microarray (version 5.0 from the Consortium for Functional 
Glycomics) as described previously. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Antibodies and antigens. The following antibodies and reagents were procured 
by the IAVI Neutralizing Antibody Consortium: antibody 2G12 (Polymun 
Scientific), antibody F425/b4E8 (provided by L. Cavacini, Beth Israel Deaconess 
Medical Center), soluble CD4 (Progenics), HxB2 gp120, SF162 gp120, BaL gp120, 
JR-FL gp120, JR-CSF gp120 and YU2 gp120 (provided by G. Stewart-Jones, 
Oxford University). Purified ADA gp120 was produced in the laboratory of R. 
Doms, University of Pennsylvania. Fab X5 was expressed in Escherichia coli and 
purified using an anti-human Fab specific affinity column. Deglycosylated gp120 
JR-FL was expressed in HEK 293S GnTI−/− cells and treated with Endo H (Roche). 

Donors. The donors identified for this study were selected from the IAVI-sponsored 
study, Protocol G. Eligibility for enrolment into Protocol G was defined as: male or 
female at least 18 years of age with documented HIV infection for at least three years, 
clinically asymptomatic at the time of enrolment and not currently receiving 
antiretroviral therapy. Selection of individuals for monoclonal antibody 
generation was based on a rank-order high-throughput screening and analytical 
algorithm. Volunteers were identified as elite neutralizers based on broad and potent 
neutralizing activity against a cross-clade pseudovirus panel. 

Isolation of monoclonal antibodies. The method for isolating human monoclonal 
antibodies from memory B cells in circulation has previously been described. 
Surface IgG+ B cells seeded at near-clonal density in 384-well microplates were 
activated in short-term culture. Supernatants were screened for neutralization 
activity against 2-4 pseudotyped viruses for which neutralization activity was 
detected at high titres in the donor serum. Heavy- and light-chain variable regions 
were isolated from B-cell lysates of selected neutralizing hits by reverse transcrip- 
tion from RNA followed by multiplex PCR amplification using family-specific 
V-gene primer sets. Amplicons from each lysate were uniquely tagged with multiplex 
identifier (MID) sequences and 454 sequencing regions (Roche). A normalized 
pooling of gamma, kappa and lambda chains was performed based on agarose gel 
image quantitation and the pool was analysed by 454 Titanium sequencing. 
Consensus sequences of the VH and VL chains were generated using the 
Amplicon Variant Analyser (Roche) and assigned to specific B-cell culture wells 
by decoding the MID tags. Selected VH and VL chains were synthesized and cloned 
in expression vectors with the appropriate IgG1, Ig kappa or Ig lambda constant 
domain. Monoclonal antibodies were reconstituted by transient transfection in 
HEK293 cells followed by purification from serum-free culture supernatants. 

PGT antibody expression and purification. Antibody genes were cloned into an 
expression vector and transiently expressed with the FreeStyle 293 Expression 
System (Invitrogen). Antibodies were purified using affinity chromatography 
(Protein A Sepharose Fast Flow, GE Healthcare) and purity and integrity checked 
with SDS-PAGE. 

Neutralization assays. Neutralization by monoclonal antibodies and donor sera 
was performed by Monogram Biosciences using a single round of replication 
pseudovirus assay as previously described. Briefly, pseudoviruses capable of a 
single round of infection were produced by co-transfection of HEK293 cells with a 
subgenomic plasmid, pHIV-1lucΔu3, that incorporates a firefly luciferase indicator 
gene and a second plasmid, pCXAS, which expressed HIV-1 Env libraries or 
clones. Following transfection, pseudoviruses were harvested and used to infect 
U87 cell lines expressing co-receptors CCR5 or CXCR4. Pseudovirus neutralization 
assays using HIV-1JR-CSF alanine mutants are fully described elsewhere. 
Neutralization activity of monoclonal antibodies against HIV-1JR-CSF alanine 
mutants was measured using a TZM-BL assay, as described. Kifunensine-treated 
pseudoviruses were produced by treating 293T cells with 25 µM kifunensine on the 
day of transfection. Memory B-cell supernatants were screened in a 
micro-neutralization assay against a cross-clade panel of HIV-1 isolates and 
SIVmac239 (negative control). This assay was based on the 96-well pseudotyped 
HIV-1 neutralization assay (Monogram Biosciences) and was modified for 
screening 15 µl of B-cell culture supernatants in a 384-well format. 

Cell surface binding assays. Titrating amounts of antibodies were added to HIV-1 
Env-transfected 293T cells, incubated for 1 h at 37 °C, washed with FACS buffer, 
and stained with goat anti-human IgG F(ab′)2 conjugated to phycoerythrin 
(Jackson ImmunoResearch). Binding was analysed using flow cytometry, and 
binding curves were generated by plotting the mean fluorescence intensity of 
antigen binding as a function of antibody concentration. For competition assays, 
titrating amounts of competitor antibodies were added to the cells 30 min before 
adding biotinylated PGT monoclonal antibodies at a concentration required to 
give half-maximum effective concentration (EC50). 

ELISA assays. For antigen-binding ELISAs, serial dilutions of monoclonal 
antibodies were added to antigen-coated wells and binding was probed with 
alkaline-phosphatase-conjugated goat anti-human IgG F(ab′)2 antibody 
(Pierce). For competition ELISAs, titrating amounts of competitor monoclonal 
antibodies were added to gp120-coated ELISA wells and incubated for 30 min 
before adding biotinylated PGT monoclonal antibodies at a concentration 
required to give IC50. Biotinylated PGT monoclonal antibodies were detected 
using alkaline-phosphatase-conjugated streptavidin (Pierce) and visualized using 
p-nitrophenol phosphate substrate (Sigma). 

Glycan microarray analysis. Monoclonal antibodies were screened on a printed 
glycan microarray version 5.0 from the CFG as described previously. Antibodies 
were used at a concentration of 30 µg ml⁻¹ and were precomplexed with 15 µg 
ml⁻¹ secondary antibody (goat anti-human-Fc-rPE, Jackson ImmunoResearch) 
before addition to the slide. Complete glycan array data sets for all antibodies can 
be found at http://www.functionalglycomics.org in the CFG data archive under 
“cfg_rRequest_2250”. 

Oligomannose dendron synthesis. The oligomannose dendrons (Man4D and 
Man9D) were synthesized by Cu(I)-catalysed alkyne-azide cycloaddition between 
azido oligomannose and the second generation of AB3-type alkynyl dendron. 
Detailed procedures and characterization were previously reported. 

Fabrication of gp120 microarray. NHS-activated glass slides (Nexterion slide H, 
Schott North America) were printed with a robotic pin (Arrayit 946) to deposit 
gp120 JR-FL at concentrations of 750 or 250 µg ml⁻¹ in printing buffer (120 mM 
phosphate, pH 8.5, containing 5% glycerol and 0.01% Tween 20). Twelve replicates 
were used for each concentration. The printed slides were incubated overnight in a 
chamber at 75% relative humidity and treated with blocking solution (superblock 
blocking buffer in PBS, Thermo) at 25 °C for 1 h. The slides were then rinsed with 
PBS-T (0.05% Tween 20) and PBS buffer, and centrifuged at 200g to remove 
residual solution from slide surface. 

Oligomannose dendron-gp120 competition assay with monoclonal antibodies. 
Serially diluted oligomannose dendrons were mixed with monoclonal antibody 
(40 µg ml⁻¹) in PBS-BT buffer (1% BSA and 0.05% Tween 20 in PBS). The 
mixtures were applied directly to each sub-array on the slide. After incubation in a 
humidified chamber for 1 h at 25 °C, the slides were rinsed sequentially with PBS-T (0.05% 
Tween 20 in PBS) and PBS buffer, and then centrifuged at 200g. Each sub-array was 
then stained with Cy3-labelled goat anti-human Fc IgG (7.5 µg ml⁻¹ in PBS-BT) 
for 1 h in a humidified chamber. The slides were then rinsed sequentially with PBS-T 
and deionized water and centrifuged at 200g. The fluorescence of the final arrays was 
imaged at 10 µm resolution (excitation: 540 nm; emission: 595 nm) with an 
ArrayWorx microarray reader (Applied Precision). 

Sequence analysis. Germline genes were predicted using the immunoglobulin 
sequence alignment tools IMGT/V-QUEST and SoDA2. Clonally related 
sequences were identified by common germline V-genes and long stretches of 
identical N-nucleotides. 

Statistics. Statistical analyses were done with Prism 5.0 for Mac (GraphPad). Viruses 
that are not neutralized at an IC50 or IC90 < 50 µg ml⁻¹ were given a value of 50 µg 
ml⁻¹ for median calculations. For combinations of antibodies, a virus was counted as 
covered if at least one of the monoclonal antibodies neutralized it, depending on 
individual concentrations (IC50). This approach does not take additivity into account 
and therefore underestimates the neutralization potency of antibody combinations. 
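
The two rules in the Statistics paragraph can be sketched in a few lines of Python. This is a hedged illustration, not the Prism workflow actually used: the function names, the use of `None` for a non-neutralized virus and the example panel are all assumptions made here.

```python
from statistics import median

CENSOR_UG_ML = 50.0  # value assigned to viruses not neutralized below 50 µg/ml


def censored_median(ic50_values):
    """Median IC50 with non-neutralized viruses set to the 50 µg/ml ceiling."""
    return median(
        CENSOR_UG_ML if value is None or value >= CENSOR_UG_ML else value
        for value in ic50_values
    )


def combination_coverage(panel, cutoff):
    """Fraction of viruses neutralized below `cutoff` by at least one antibody.

    `panel` maps antibody name -> list of IC50 values, one per virus, in the
    same virus order for every antibody. No additivity is modelled.
    """
    n_viruses = len(next(iter(panel.values())))
    covered = sum(
        any(values[i] is not None and values[i] < cutoff
            for values in panel.values())
        for i in range(n_viruses)
    )
    return covered / n_viruses


# Hypothetical IC50 values (µg/ml) for two antibodies against four viruses.
panel = {
    "mAb_A": [0.01, None, 5.0, 0.5],
    "mAb_B": [None, 0.05, 60.0, 0.8],
}
print(censored_median(panel["mAb_A"]))     # 2.75
print(combination_coverage(panel, 0.1))    # 0.5
```

Because a virus counts as covered only when a single antibody reaches the cut-off on its own, the sketch shares the conservative bias noted in the text.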
Synthetic chromosome arms function in yeast and 
generate phenotypic diversity by design 


Jessica S. Dymond, Sarah M. Richardson, Candice E. Coombes, Timothy Babatz, Héloise Muller, Narayana Annaluru, 
William J. Blake, Joy W. Schwerzmann, Junbiao Dai, Derek L. Lindstrom, Annabel C. Boeke, Daniel E. Gottschling, 
Srinivasan Chandrasegaran, Joel S. Bader & Jef D. Boeke 


Recent advances in DNA synthesis technology have enabled the 
construction of novel genetic pathways and genomic elements, 
furthering our understanding of system-level phenomena. The ability to 
synthesize large segments of DNA allows the engineering of path- 
ways and genomes according to arbitrary sets of design principles. 
Here we describe a synthetic yeast genome project, Sc2.0, and the 
first partially synthetic eukaryotic chromosomes, Saccharomyces 
cerevisiae chromosome synIXR, and semi-synVIL. We defined three 
design principles for a synthetic genome as follows: first, it should 
result in a (near) wild-type phenotype and fitness; second, it should 
lack destabilizing elements such as tRNA genes or transposons; 
and third, it should have genetic flexibility to facilitate future studies. 
The synthetic genome features several systemic modifications com- 
plying with the design principles, including an inducible evolution 
system, SCRaMbLE (synthetic chromosome rearrangement and 
modification by loxP-mediated evolution). We show the utility of 
SCRaMbLE as a novel method of combinatorial mutagenesis, 
capable of generating complex genotypes and a broad variety of 
phenotypes. When complete, the fully synthetic genome will allow 
massive restructuring of the yeast genome, and may open the door to 
a new type of combinatorial genetics based entirely on variations in 
gene content and copy number. 

The first phase of any genome engineering project is design 
(Supplementary Text 1). We designed the right arm of chromosome 
IX (IXR) according to the three principles outlined above and in Box 1. 
IXR is the smallest chromosome arm in the genome and features several 
genomic elements of interest (Fig. 1a), making it suitable for a pilot 
study. The designed sequence, synIXR, is based on a native IXR 
sequence extending from open reading frame (ORF) YIL002W through 
the centromere and the remainder of chromosome IXR, an 89,299-base- 
pair (bp) sequence (native IXR position 350,585-438,993 (ref. 10)). In 
accordance with the second design principle, a transfer RNA gene, a Ty1 
long terminal repeat (LTR), and telomeric sequences were removed. The 
final synIXR sequence, 91,010 bp, is slightly longer than the native 
sequence owing to the inclusion of 43 loxPsym sites, and it replaces 
20.3% of the native chromosome. A 30-kilobase (kb) telomeric segment 
of the left arm of chromosome VI (semi-synVIL) was similarly designed 
(Fig. 1b and Supplementary Text 2), and replaced 15.7% of the native 
chromosome. Of the original sequence lengths, 17% was changed by 
base substitution, deleted, or inserted during design of the two synthetic 
segments (Supplementary Table 1). Sequences were submitted to 
GenBank (sequences synIXR:JN020955 and semi-synVIL:JN020956 
are also available in Supplementary Information). 


We systematically introduced two sets of changes in silico using 
the genome editing suite BioStudio (S.M.R., J.S.D., J.D.B. and J.S.B., 
unpublished data): TAG/TAA stop-codon swaps and PCRTag 
sequences (see Supplementary Text 1). In recognition of the third 
design principle, the elimination of the TAG stop codon by recoding 
to TAA frees a codon for future expansion of the genetic code (for 
example, by adding a twenty-first, unnatural amino acid), and 
could serve as a future mechanism of reproductive isolation and con- 
trol. PCRTags are short pairs of recoded sequences, unique to either 
the wild-type or synthetic genome. They serve as convenient, low-cost, 
closely spaced genetic markers for verifying the introduction of syn- 
thetic sequence and the removal of native sequence by allowing the 
design of PCR primers for rapid evaluation of the presence of synthetic 
sequences and absence of native sequences. This is critical for evalu- 
ating the incorporation of synthetic DNA (see below and Sup- 
plementary Text 2). PCRTags, designed in silico, were tested in trip- 
licate to verify specificity (Supplementary Fig. 1 and Supplementary 
Tables 2 and 3). 
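
The stop-codon swap described above can be illustrated with a minimal sketch. It is not BioStudio code; the function name and the toy ORFs are hypothetical, and the sketch assumes each ORF is given 5' to 3' on its coding strand with the stop codon included.

```python
def swap_tag_stop(orf: str) -> str:
    """Recode a terminal TAG stop codon to TAA; leave other ORFs unchanged."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if orf[-3:] == "TAG":
        # Only the final codon is touched, so the encoded protein is unchanged.
        return orf[:-3] + "TAA"
    return orf


# Hypothetical toy ORFs, not real yeast genes.
toy_orfs = {"geneA": "ATGGCTAAATAG", "geneB": "ATGCCGTGA"}
recoded = {name: swap_tag_stop(seq) for name, seq in toy_orfs.items()}
```

Because only the terminal codon changes, the edit is silent at the protein level while removing every use of TAG as a stop signal.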

LoxPsym sequences are nondirectional loxP sites that are capable of 
recombining in either orientation. Theoretically, they produce 
inversions or deletions with equal probability. Under the third design 
principle, these sites form the substrate for the inducible SCRaMbLE 
system and are intended to generate combinatorial diversity. We 
inserted loxPsym sites 3 bp after the stop codon of each nonessential 
gene and at major landmarks, such as sites of LTR and tRNA deletions, 
flanking the centromere CEN9, and adjacent to telomeres (Fig. 1 and 
Supplementary Text 1). LoxPsym sites inserted at equivalent positions 
genome-wide will allow the formation of many structurally distinct 
genomes. 
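
The placement rule for loxPsym sites (3 bp after the stop codon of each nonessential gene) can be sketched as a coordinate calculation. The `Gene` record, the 0-based coordinate convention and the example genes are assumptions made for illustration; landmark sites (deleted LTRs and tRNA genes, centromere flanks, telomeres) are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    stop_start: int  # 0-based genomic index of the leftmost base of the stop codon
    strand: str      # "+" or "-"
    essential: bool


def loxpsym_insertion_index(gene: Gene, offset: int = 3) -> int:
    """Index before which a site is inserted, `offset` bp downstream of the stop."""
    if gene.strand == "+":
        # Stop codon occupies [stop_start, stop_start + 3); skip `offset` more bases.
        return gene.stop_start + 3 + offset
    # On the minus strand, downstream means lower genomic coordinates.
    return gene.stop_start - offset


def plan_sites(genes):
    """Insertion indices for all nonessential genes, sorted along the chromosome."""
    return sorted(loxpsym_insertion_index(g) for g in genes if not g.essential)


# Hypothetical genes for illustration only.
genes = [
    Gene("g1", 100, "+", False),
    Gene("g2", 200, "-", False),
    Gene("g3", 300, "+", True),
]
sites = plan_sites(genes)
```

Essential genes are skipped, so no site is placed where a Cre-mediated deletion between adjacent sites would remove only an essential gene's 3' end.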

Figure 1 | Maps of synIXR and semi-synVIL. Boxed text indicates elements 
deleted in the synthetic chromosomes. Vertical green bars inside ORFs indicate 
PCRTag amplicons; only sequences at the outside edges of these are recoded. 
ARS, autonomously replicating sequence. a, SynIXR. Vector is circular. 
b, Semi-synVIL. 

After completion of chromosome design and construction, 'arm-swap' 
strains, wherein the wild-type sequence was replaced with synthetic 
sequence, were generated. The synIXR chromosome, cloned in a 
circular bacterial artificial chromosome (BAC) vector, includes all 
sequences needed for propagation in yeast and bacteria (Fig. 1a). We 
introduced synIXR into a diploid strain by transformation (Fig. 2a); 
typically, about 10-15% of the synIXR transformants obtained were 
positive for all PCRTag pairs tested (Fig. 2d). We chose one such 
transformant, strain A (Fig. 2a), and truncated one native IXR 
homologue (IXΔR) by transforming with a suitably designed linear DNA 
fragment, introducing a selectable marker (URA3) and a telomere 
seed sequence, generating strain C (Fig. 2b). Chromosome truncation 
was confirmed by pulsed-field gel electrophoresis analysis (Fig. 2c), 
and strain C was sporulated to generate haploids carrying synIXR and 
IXΔR. We observed more spore lethality than in control crosses, 
presumably owing to segregation of synIXR away from IXΔR; cells 
bearing only synIXR or only IXΔR would lack many essential genes and 
would not survive. PCRTag analysis of 14 synIXR candidate arm-swap 
strains revealed ten haploids with all synthetic PCRTags and no native 
PCRTags present (Fig. 2d and Supplementary Fig. 2). The remaining 
four strains carried BACs with patchworks of synthetic and native 
sequences indicative of meiotic gene-conversion events 
(Supplementary Fig. 2). Sanger sequencing and structural analyses (Supplementary 
Fig. 3, Supplementary Table 4 and Supplementary Text 3) of recovered 
synIXR BACs revealed that no mutations had occurred in the synthetic 
chromosome. Thus, the synthetic sequence is replicated faithfully. 
Whereas synIXR was incorporated in a circular form, we used an 
alternate strategy to integrate the semi-synVIL chromosome fragment 
into native chromosome VI (Supplementary Fig. 4): a linear synthetic 
fragment marked with LEU2 was transformed into a YFL054C::kanMX 
strain. Approximately 13% of transformants (75 of 586) had the 
Leu⁺ G418ˢ phenotype expected for the desired integrant. PCRTag 
analysis showed that 10 of 12 such strains contained only synthetic 
PCRTags, as expected for full replacement (Supplementary Fig. 5). 

The first design principle prioritizes a wild-type phenotype and a 
high level of fitness despite the incorporated modifications. SynIXR 
has a designed sequence alteration approximately every 500 bp, 2.64% 
of total sequence is altered, and it carries 43 loxPsym sites. To check for 
negative effects of modifications on fitness, we examined colony size 
and morphology under various conditions, and also performed 
transcript profiling. We inspected colony size and morphology of synIXR 
swap strains under six distinct growth conditions. It was impossible to 
distinguish swap strains from the wild type (BY4741) under these 
conditions, indicating that any fitness defect attributable to synIXR 
is modest; fitness tests on semi-synVIL gave similar results 
(Supplementary Fig. 6). 

Figure 2 | Strain construction and verification. a, Generation of synIXR 
haploids. The synIXR BAC (L) was transformed into the wild-type strain 
BY4743 (WT, step I) to generate strain A (step II). One copy of native IXR in A 
was replaced with a URA3-telomere seed cassette (U), generating IXΔR in 
strain B (step III). B was sporulated to produce haploids (step IV). Circle, 
centromere; small square, LEU2 gene. b, Structure of IXΔR. c, Electrophoretic 
karyotype (top panel) and Southern blot of NotI digest (bottom panel) of the 
wild-type, strain A, strain B and synIXR-1D genomes. Linearized synIXR 
migrates as a discrete band of ~100 kb. The probe (YIL002C) detects all 
isoforms of chromosome IX. *, native IXR; **, IXΔR. d, PCRTag analysis. SYN, 
synIXR BAC; V, vector amplicon. 

Figure 3 | Transcript profiling of wild-type and synIXR strains. Transcript profiling of synIXR-1D, -6B, and -22D. The log₂ ratio of RNA abundance relative to 
wild type (BY4741 or BY4742) is shown. YIL002C and YIL001W (blue) exist in two copies. Essential genes are labelled in red. Error bars, s.d. 
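
The PCRTag scoring used to identify arm-swap strains can be summarized in a short sketch. The data layout (one pair of booleans per PCRTag locus) and the category labels are assumptions for illustration; the actual scoring was done from gel images.

```python
def classify_candidate(tags):
    """Classify a strain from PCRTag results.

    tags: list of (synthetic_amplified, native_amplified) booleans,
    one tuple per PCRTag locus.
    """
    syn = [s for s, _ in tags]
    nat = [n for _, n in tags]
    if all(syn) and not any(nat):
        return "arm swap"       # every synthetic tag present, no native tag
    if all(nat) and not any(syn):
        return "native"         # no synthetic sequence detected
    if all(s != n for s, n in tags):
        return "patchwork"      # each locus is synthetic or native, but not uniformly
    return "other"              # e.g. both tag types amplify at some locus


# Hypothetical three-locus profiles.
full_swap = [(True, False), (True, False), (True, False)]
mosaic = [(True, False), (False, True), (True, False)]
```

Under this scheme the ten fully synthetic haploids would score as "arm swap" and the four gene-conversion products as "patchwork".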

Synonymous substitutions, introduction of loxPsym sites or other 
changes might change gene expression. We performed transcript pro- 
filing on the swap strains synIXR-1D, synIXR-6B, and synIXR-22D 
(Supplementary Text 4); these studies revealed notable but predictable 
trends (Fig. 3). As expected, genes present in two copies (YIL001W and 
YIL002C, present on both synIXR and IXΔR) were approximately 
doubled in transcript abundance. Most genes showed no substantial 
expression change, although a few showed modest decreases; however, 
the subtelomeric genes YIR039C and YIR042C showed increased 
expression. We speculate that in the circular synthetic chromosome, 
these are released from telomeric silencing, resulting in their over- 
expression. Overall, synIXR genes show relatively normal expression, 
indicating that loxPsym sites and PCRTags affect expression only 
minimally. Similarly, no substantial changes were observed by RNA 
blotting (Supplementary Fig. 7a). To detect possible compensatory 
transcriptome changes, we profiled transcripts genome-wide. Except 
for trivial differences attributable to slightly different configurations of 
selectable markers in the strains, there were no consistent, statistically 
significant differences outside IXR itself (Supplementary Fig. 7b). 
Thus, modifications present in synIXR and semi-synVIL do not pro- 
duce major fitness effects or compensatory transcriptomic alterations. 
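
As a small worked example of the scale used in Fig. 3, the log₂ ratio of a gene present in two copies is expected to be close to 1, and an unchanged gene close to 0. The abundance values below are made up for illustration.

```python
import math


def log2_ratio(sample: float, reference: float) -> float:
    """log2 of transcript abundance in a test strain relative to wild type."""
    return math.log2(sample / reference)


# Hypothetical abundances in arbitrary units.
two_copy_gene = log2_ratio(200.0, 100.0)   # doubled transcript level
unchanged_gene = log2_ratio(100.0, 100.0)  # no change
```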

A central feature of the synthetic yeast genome is the incorporated 
conditional genome instability system, SCRaMbLE. The design principles 
dictate that SCRaMbLE should be available for use on demand, 
yet should lie dormant until intentional Cre recombinase induction, at 
which point generation of genetic diversity is desirable. To complete 
the SCRaMbLE toolkit, we incorporated an engineered Cre recombinase 
fused to the murine oestrogen binding domain (EBD). This 
recently described Cre-EBD variant is oestradiol-inducible, has low 
basal activity and is controlled by the daughter-cell-specific promoter 
SCW11 (Supplementary Fig. 8). The plasmid pSCW11-Cre-EBD 
should produce a pulse of recombinase activity once and only once 
in each cell’s lifetime, and should depend on oestradiol exposure. The 
uninduced, integrated construct is well tolerated even in swap strains, 
which, with 43 loxPsym sites, are expected to be Cre-hypersensitive. 
Upon oestradiol addition, rearrangements were induced at the 
loxPsym sites and viability dropped by 100-fold in synIXR strains 
(Fig. 4a and Supplementary Fig. 9). This loss of viability probably 
results from loss of synIXR essential genes. In contrast, viability in 
semi-synVIL, which lacks essential genes, is not affected by Cre induc- 
tion (Fig. 1b and Supplementary Fig. 9d). 


Semi-synVIL contains just five loxPsym sites, including one 
immediately adjacent to the telomeric TG₁₋₃ repeats (Fig. 1b). This 
simple configuration allows comprehensive PCR-based mapping of 
rearrangements of four of the loxPsym sites in SCRaMbLEd strains. 
A SCRaMbLEd semi-synVIL population was analysed by PCR for 
most of the possible rearranged configurations, revealing a large 
variety of deletions and inversions (Fig. 4b); most predicted rearrange- 
ments were readily detected. 

The symmetry of loxPsym sites allows alignment in two orienta- 
tions, theoretically giving rise to deletions and inversions with equal 
frequency. SynIXR contains 43 loxPsym sites, allowing more than 
3,600 potential pairwise interactions between synIXR loxPsym sites. 
We reasoned that SCRaMbLEd synIXR clones should display high 
phenotypic diversity. Indeed, SCRaMbLEd swap strains show more 
growth-rate heterogeneity than wild-type controls (Fig. 4c and Sup- 
plementary Fig. 10). These SCRaMbLEd clones show many different 
phenotypes (Supplementary Fig. 11 and Supplementary Text 5). In 
summary, SCRaMbLE is sufficient to generate substantial genetic 
heterogeneity and complex phenotypes. 
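
The rearrangement logic can be made concrete with a toy simulation. This is only a sketch under simplifying assumptions: the chromosome is a list of signed segment identifiers, a loxPsym site is assumed before every segment and after the last one, each event picks two distinct sites uniformly at random, and deletion and inversion are equally likely. All names are hypothetical.

```python
import random


def apply_event(chrom, i, j, kind):
    """Rearrange the segments lying between sites i and j (i < j).

    chrom is a list of signed segment ids; site k sits just before chrom[k],
    and site len(chrom) sits after the last segment.
    """
    if not 0 <= i < j <= len(chrom):
        raise ValueError("need 0 <= i < j <= len(chrom)")
    middle = chrom[i:j]
    if kind == "deletion":
        middle = []
    elif kind == "inversion":
        # Reverse the order and flip the orientation of each segment.
        middle = [-s for s in reversed(middle)]
    else:
        raise ValueError("kind must be 'deletion' or 'inversion'")
    return chrom[:i] + middle + chrom[j:]


def scramble(chrom, n_events, rng):
    """Apply up to n_events random deletions or inversions."""
    for _ in range(n_events):
        if not chrom:
            break  # nothing left to rearrange
        i, j = sorted(rng.sample(range(len(chrom) + 1), 2))
        chrom = apply_event(chrom, i, j, rng.choice(["deletion", "inversion"]))
    return chrom


def is_viable(chrom, essential):
    """A clone survives only if every essential segment is still present."""
    return essential <= {abs(s) for s in chrom}


rng = random.Random(0)
parent = [1, 2, 3, 4, 5]
clone = scramble(parent, 3, rng)
```

Checking `is_viable(clone, essential)` for many simulated clones mirrors the observation that viability falls when segments carrying essential genes can be deleted.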

To characterize the utility of SCRaMbLE further, we performed a 
mutagenesis study. SynIXR encodes both MET28 and LYS1, genes 
required for biosynthesis of amino acids. Null mutants result in 
auxotrophy, and can be detected easily by replica-plating. We 
introduced episomal Cre-EBD (pSCW11-Cre-EBD-URA3MX cloned in a 
CEN plasmid) into strain C that was previously made LYS2⁺ (strain D, 
yJS587), and performed SCRaMbLE. We screened 20,242 colonies and 
3% (604 of 20,242) were candidate lys1 and/or met28 auxotrophs. Of 
360 candidates tested more rigorously, 295 (81.9%) were confirmed: 
we found 212 Lys⁻ auxotrophs (1.37%), 66 Met⁻ auxotrophs (0.43%) 
and, notably, 17 Lys⁻ Met⁻ double auxotrophs (0.11%). PCRTag 
profiles of 24 Met⁻ auxotrophs, 35 Lys⁻ auxotrophs and seven double 
auxotrophs (Fig. 4d) showed that all Met⁻ auxotrophs had deletions 
in the loxPsym-flanked segment containing MET28 and YAP5, whereas 
all Lys⁻ auxotrophs had deletions in the loxPsym-flanked segment 
containing LYS1. The deletion profiles of many SCRaMbLEd auxotrophs 
were highly variable and more than one segment was often missing. 
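
The deletion calls behind these profiles can be sketched as follows. The PCRTag pair numbers for MET28 (25) and LYS1 (45) are those given in the Fig. 4 legend; the profile layout, the function names and the rule that an untested pair counts as present are assumptions for illustration.

```python
# PCRTag pair numbers from the Fig. 4 legend.
MARKER_PAIRS = {"MET28": 25, "LYS1": 45}


def missing_pairs(profile):
    """Pairs that failed to amplify; each marks a presumed deleted segment.

    profile: dict mapping PCRTag pair number -> True if the amplicon was seen.
    """
    return sorted(pair for pair, seen in profile.items() if not seen)


def predicted_auxotrophy(profile):
    """Predict the auxotrophy class from loss of the MET28 and LYS1 segments."""
    # Assumption: a pair absent from the profile was not tested and counts as present.
    lost = {g for g, pair in MARKER_PAIRS.items() if not profile.get(pair, True)}
    if lost == {"MET28", "LYS1"}:
        return "Lys- Met-"
    if lost == {"MET28"}:
        return "Met-"
    if lost == {"LYS1"}:
        return "Lys-"
    return "prototroph"


# Hypothetical clone missing pairs 24 and 25.
clone_profile = {24: False, 25: False, 26: True, 45: True}
```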

To confirm that the observed SCRaMbLE phenotypes resulted solely 
from deletions in synIXR, we recovered the synIXR chromosomes 
from two Met⁻ auxotrophs into Escherichia coli, and then introduced 
them to a clean genetic background. In both cases, the auxotrophic 
phenotype was associated with the presence of the SCRaMbLEd 
chromosomes (Supplementary Fig. 12 and Supplementary Text 6). Thus, 
the SCRaMbLE system is a highly effective method of mutagenesis, 
giving rise to mutants with different genetic backgrounds and 
generating a wide variety of double mutants. 

Figure 4 | SCRaMbLE rearranges genomes. a, Cre induction reduces the 
fitness of the synIXR strain (SYN) but not the wild type (WT; BY4741). EST, 
oestradiol; time, oestradiol exposure time. b, PCR analysis of semi-synVIL 
SCRaMbLE. The map shows primer positions. Amplicon 13 is spurious (wrong 
size). SCR, SCRaMbLE. c, Shifted colony-size distribution in SCRaMbLE 
survivors (wild type and the swap strain synIXR-1D). d, PCRTag analysis of 
Met⁻ (red), Lys⁻ (blue) and Met⁻ Lys⁻ (green) auxotrophs using PCRTags. 
PCRTag pairs are numbered for each column (see Supplementary Table 2); 
MET28, pair 25; LYS1, pair 45. Each row represents one clone. Shaded boxes 
indicate presumed deletions. Panels a-c show strains with integrated Cre-EBD; 
d shows episomal Cre-EBD. 

Our results suggest that there is no major theoretical 
impediment to extending the design strategy outlined here to the entire 
yeast genome, apart from the challenge of 12-megabase DNA 
synthesis. Whether or not fitness defects will accumulate as design and 
synthesis are scaled up remains to be seen; however, the overall high 
fitness of the swap strains described here validates the design strategy. 
Furthermore, the iterative, bottom-up approach will allow 
identification of potential 'problem regions' in synthetic sequences as synthesis 
moves forward. If a given swap experiment results in only 
transformants with reduced fitness (or if no transformants are obtainable), 
then the underlying defect can be mapped by introducing 
sub-segments, facilitated by strategic placement of unique restriction sites 
throughout synthetic chromosome arms. Also, because a subset of 
transformants consist of patchworks of native and synthetic sequence 
(Supplementary Figs 2 and 5), analysis of such strains can be used to 
map phenotypic defects rapidly. The stability and sequence fidelity of 
large circular chromosomes seen here and elsewhere bode well for 
the use of yeast as a host platform for synthetic biology. 

SCRaMbLE may become a useful general strategy for analysing 
genome structure, content and function. One important feature of 
SCRaMbLE is its potential for customization: expression of different 
Cre-EBD variants from various promoters at distinct levels of inducer 
(oestradiol) should produce distinct SCRaMbLE dynamics. Use of 
weaker promoters than pSCW11, use of promoters expressed at 
different phases of the cell cycle, performing SCRaMbLE in diploids, and 
lowering the inducer concentration should all contribute to decreased 
lethality of SCRaMbLE strains, an important consideration as 
additional segments of the genome are replaced with synthetic counterparts 
and the proportion of essential genes that can be lost by SCRaMbLEing 
increases. As shown here, SCRaMbLE mutagenesis is efficient and 
generates mutants with a wide variety of different genetic backgrounds. 
It is possible that different combinations of gene deletions will give rise 
to a variety of subtly different phenotypes that can be mapped rapidly 
by PCRTag analysis; more extensive analysis by deep sequencing will 
reveal changes in genome structure and content. As the synthetic yeast 
genome grows, opportunities for genome rearrangement will increase 
exponentially. In principle, changes in chromosome number, ploidy, 
content and structure are all possible, increasing the utility of the 
SCRaMbLE system. For example, there may be many different routes 
to a minimal genome, and exploring all of them by a hit-or-miss 
predictive approach is impractical and unlikely to yield comprehensive 
results. Using SCRaMbLE, many independent routes of genome 
minimization can be explored at one time, under many environmental 
conditions, for instance by growing yeast cells long-term in serially 
transferred batch cultures, or in a chemostat or turbidistat under 
conditions in which Cre is minimally active. Such an approach may also 
lead to derivatives that are more fit than the parent, for example, by 
gene duplication events facilitated by the Cre-EBD/loxPsym system. 


BOX 1 
Modifications in synthetic sequence 


Elements removed 

Retrotransposons: The S. cerevisiae genome contains both active retrotransposons and retrotransposon-derived sequences. These highly repetitive 
sequences are known to contribute to genome instability. Because retrotransposons are presumed to be nonessential in yeast, we are eliminating 
these sequences from the synthetic genome. 

Subtelomeric repeats: Two major types of subtelomeric repeats, Y' and X elements, reside in the genome. Y' elements are of unknown function, and 
are present at some, but not all, S. cerevisiae chromosome ends. In contrast, X elements are present in a single copy at all S. cerevisiae chromosome 
ends; they are more highly divergent, and function in telomeric silencing and possibly in chromosome segregation. To create a more streamlined 
genome, all Y' elements will be deleted from the synthetic genome; extant X elements will be replaced with the consensus core X-element sequence, as 
in semi-synVIL. 

Introns: The yeast genome is estimated to contain approximately 285 introns. Based on a previous intron-deletion study, we do not anticipate that 
removal of introns will result in fitness defects; however, in some cases these introns house small non-coding RNAs (snoRNAs) that can be expressed 
ectopically in the synthetic yeast. 

Elements relocated to extrachromosomal array 

tRNA genes: tRNA genes (tDNAs) are highly redundant, with 275 nuclear tDNAs encoding only 42 tRNA species. In addition, these genes are 
known regions of genome instability. They will therefore be relocated to a dedicated chromosome to contain any instability resulting from their 
presence. 

Elements replaced 

TAG stop codons replaced by TAA: Removal of the TAG stop codon from the synthetic genome will allow future genetic code manipulation. The 'free' 
codon may be used to incorporate artificial amino acids; alternatively, the TAG codon may be placed in essential genes, and, exploiting an 
engineered orthogonal synthetase/tRNA pair, specify a non-genetically encoded amino acid, thereby providing a mechanism of reproductive isolation 
and an additional level of control over the synthetic yeast. 

Individual synonymous codons: The synthetic genome is fabricated in fragments as small as 750 bp. Unique restriction sites are necessary within 
the synthetic fragment to facilitate construction of these building blocks into large contigs of up to 100 kb. Short stretches of fewer than four codons 
may therefore be synonymously recoded to introduce or eliminate restriction sites. 

Strings of synonymous codons: Although several modifications exist between the native and synthetic genomes, the presence of a dedicated 
mechanism to distinguish between the two sequence types is invaluable. Short stretches of fewer than ten codons are therefore recoded to generate 
'PCRTags', synonymous sequences used as the basis for PCR primer design to amplify selectively from wild-type or synthetic genomes. 

Elements introduced 

LoxPsym sites: Symmetrical loxP sites are inserted in the 3' UTR of all non-essential genes, as well as at synthetic landmarks. LoxPsym sites lack 
the directionality of canonical loxP sites, and can therefore align in two orientations. As a result, both inversions and deletions are predicted at equal 
probability. These loxPsym sites and an inducible Cre recombinase form the basis of the SCRaMbLE toolkit. 

Elements not changed 

Gene order: Gene order is preserved in the synthetic yeast to prevent incorporation of a non-permissible configuration in the design phase. 
Induction of SCRaMbLE results in changes in gene order and chromosome structure; all recovered SCRaMbLEd yeast have viable genome structures. 

Noncoding regions: Except where noted, noncoding regions have not been modified. The yeast genome is well annotated; however, it is of 
paramount importance that the synthetic yeast be as fit as wild type until SCRaMbLE is induced. We therefore eschewed changes of noncoding 
regions to avoid disrupting unannotated critical elements. The few modifications that are made in noncoding sequence are kept to a minimum. 


METHODS SUMMARY 


DNA preparation. BAC DNA was prepared using the Qiagen plasmid midi kit or 
alkaline lysis. The following protocol modifications were made: cells were diluted 
1:100 from an overnight culture into 50 ml of Luria broth with 50 µg ml⁻¹ 
carbenicillin and grown at 30 °C for 14-16 h. Qiagen-purified DNA was treated 
with 60 µg ml⁻¹ proteinase K at 37 °C overnight, then extracted with 
phenol/chloroform. DNAs prepared without a column were phenol/chloroform extracted, 
and then treated with RNase immediately before use. 

Yeast genomic DNA for use in PCRTag analysis was prepared by standard 
methods. DNA preparation for recovery of the synIXR BAC into bacteria was 
as previously reported. 

PCR conditions. PCRTags were amplified using Taq polymerase (New England 
Biolabs). Template concentrations were 1 ng µl⁻¹ for genomic DNA and 
10 pg µl⁻¹ for purified BAC DNA. The following program was used: 94 °C 3 min; 
30 cycles of 94 °C 30 s, 65 °C 30 s, 72 °C 30 s; 72 °C 3 min. 
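
As a worked example, the PCR program above can be written as data and its total programmed hold time computed. The representation is only an illustrative sketch; ramping between temperatures is ignored, so real run times will be longer.

```python
# Each step is (temperature in degrees C, hold time in seconds).
initial_denaturation = (94, 180)
cycle = [(94, 30), (65, 30), (72, 30)]  # denature, anneal, extend
n_cycles = 30
final_extension = (72, 180)

seconds_per_cycle = sum(hold for _, hold in cycle)
total_hold_seconds = (
    initial_denaturation[1] + n_cycles * seconds_per_cycle + final_extension[1]
)
total_hold_minutes = total_hold_seconds / 60
```

With these values each cycle holds for 90 s, giving 3,060 s (51 min) of programmed hold time in total.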

RNA analysis. Total RNA was isolated by hot acid phenol extraction. Microarray 
hybridization and data analysis were performed at the Johns Hopkins Microarray 
Core Facility (http://www.microarray.jhmi.edu). Dubious ORFs and pseudogenes 
were omitted from synIXR transcript analysis. 

Pulsed-field gels. DNAs were prepared as described elsewhere. The identity of 
the chromosomes was inferred from the known molecular karyotype of wild type 
(BY4743), and from lambda ladders run on the same gel. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 


Yeast strains, transformation and tetrad analysis. Strains ABY7 and ABY8 were 
derived from strain BY4743; ABY7 (MATa) and ABY8 (MATα) otherwise share the 
genotype his3Δ1 leu2Δ0 ura3Δ0 lys2Δ0 met15Δ0 yil001::URA3 yir039::kanMX. All 
strain genotypes are listed in Supplementary Table 8. 

BY4743 spheroplasts were transformed with synIXR. The strain 
YFL054C::kanMX was transformed with synVIL restriction fragments by standard 
lithium acetate transformation. 

The synIXR-1D strain and others were backcrossed to strains ABY7 and ABY8; 
the resultant diploids were sporulated and genotyped to identify synIXR segregants. 

Phenotypic screening. Single colonies were picked into 96-well plates and grown 
for 48 h in yeast peptone dextrose (YPD) at 30 °C. (SCRaMbLE strains were grown 
for 72 h in YPD at 30 °C, diluted 1:10 and grown for 4 h before plating.) Tenfold 
dilutions were spotted on various types of agar medium and selective conditions in 
OmniTrays (NUNC), as previously described. Most cells were grown for 72 h 
(except those grown on yeast extract/peptone/glycerol/ethanol (YPGE) plates, 
which were grown for 108 h), then scored for growth and photographed. 

Yeast growth and media. Unless otherwise indicated, all experiments were per- 
formed at 30°C. YPGE was supplemented with 2% ethanol and 2% glycerol. 
Concentrations of drugs were as follows: hydroxyurea, 0.2 M; methylmethane 
sulphonate, 0.05%; 6-azauracil, 100 µg ml⁻¹; benomyl, 15 µg ml⁻¹; hydrogen 
peroxide, 1 mM; cycloheximide, 10 µg ml⁻¹. Resistance to cycloheximide and 
hydrogen peroxide was assayed by growing cells in treated medium for 2h, then 
plating on YPD. Other phenotypes were assayed by growing cells to mid-log phase 
in rich media, then spotting tenfold dilutions on selective media. 

Colony size measurements. Cells were plated at various dilutions so that similar 
numbers of colonies were observed on control and experimental 
(oestradiol-treated) plates. Colony size was measured using ImageJ software, and normalized 
against the total number of colonies on each plate. Sample sizes for data presented in 
Fig. 4c are as follows: wild-type, n = 488 colonies; wild-type + Cre + oestradiol, n = 486; 
1D, n = 395; 1D + Cre, n = 251; 1D + oestradiol, n = 416; 1D + Cre + oestradiol, 
n = 394. 
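
One way to read "normalized against the total number of colonies" is that colony counts in each size bin are divided by the number of colonies on the plate, so that plates with different sample sizes can be compared. The sketch below follows that reading; the binning scheme, the function name and the example sizes are assumptions and are not taken from the ImageJ workflow.

```python
from collections import Counter


def size_distribution(sizes, bin_width):
    """Fraction of colonies falling in each size bin.

    Returns a dict mapping the lower edge of each bin to the fraction
    of all colonies on the plate whose size lies in that bin.
    """
    if not sizes:
        raise ValueError("no colonies measured")
    counts = Counter(int(s // bin_width) for s in sizes)
    n = len(sizes)
    return {b * bin_width: c / n for b, c in sorted(counts.items())}


# Hypothetical colony areas in arbitrary units.
plate = [1, 2, 11, 12]
fractions = size_distribution(plate, 10)
```

Because each plate's fractions sum to 1, a shift towards smaller bins after oestradiol treatment can be compared directly between plates with, for example, 488 and 394 colonies.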

SynIXR BAC sequence analysis. The original synIXR BAC was sequenced by the 
manufacturer, Codon Devices. SynIXR BACs were recovered into bacteria and 
sequenced by Agencourt (Beckman Coulter Genomics), using sequencing primers 
listed in Supplementary Table 5. Repetitive sequences, including the highly internally 
repetitive MUC1 open reading frame, were PCR-amplified before sequencing when 
necessary. 

Pulsed-field gels. Samples were run on a 1.0% agarose gel in 0.5× TBE (pH 8.0) 
for 20 h at 14 °C on a clamped homogeneous electric field (CHEF) gel apparatus. 
The voltage was 3.5 V cm⁻¹, at an angle of 120° and a switch time of 60-120 s, 
ramped over 20 h. 

NotI (Promega) digests were performed on whole chromosomes embedded in 
agarose plugs. Agarose plugs were removed from the 0.5 M EDTA storage buffer, 
washed with 0.05 M EDTA for 1 h at room temperature (~23 °C), and then washed 
with 0.1× restriction enzyme buffer, followed by 1× buffer, under the same 
conditions. 

Probe preparation for northern and Southern blots. Probes were prepared 
using the Prime-It II kit (Stratagene) and hybridized using Ultrahyb hybridization 
solution (Ambion) according to the manufacturer’s instructions. 

SCRaMbLE. Cre activity was induced by exposure to 1 µM β-oestradiol 
(Sigma-Aldrich) in rich media for either 48 h (integrated Cre) or 4 h (episomal Cre), except 
where indicated otherwise. PCRTag analysis of Met and Lys auxotrophs was 
performed with a non-redundant array, using one primer pair per 
loxPsym-flanked segment. 


Antidiabetic actions of a non-agonist PPARγ ligand 
blocking Cdk5-mediated phosphorylation 


Jang Hyun Choi, Alexander S. Banks, Theodore M. Kamenecka, Scott A. Busby, Michael J. Chalmers, Naresh Kumar, 
Dana S. Kuruvilla, Youseung Shin, Yuanjun He, John B. Bruning, David P. Marciano, Michael D. Cameron, Dina Laznik, 
Michael J. Jurczak, Stephan C. Schürer, Dušica Vidović, Gerald I. Shulman, Bruce M. Spiegelman & Patrick R. Griffin 


PPARγ is the functioning receptor for the thiazolidinedione (TZD) 
class of antidiabetes drugs including rosiglitazone and pioglitazone. 
These drugs are full classical agonists for this nuclear receptor, but 
recent data have shown that many PPARγ-based drugs have a 
separate biochemical activity, blocking the obesity-linked 
phosphorylation of PPARγ by Cdk5 (ref. 2). Here we describe novel synthetic 
compounds that have a unique mode of binding to PPARγ, 
completely lack classical transcriptional agonism and block the 
Cdk5-mediated phosphorylation in cultured adipocytes and in 
insulin-resistant mice. Moreover, one such compound, SR1664, has 
potent antidiabetic activity while not causing the fluid retention and 
weight gain that are serious side effects of many of the PPARγ drugs. 
Unlike TZDs, SR1664 also does not interfere with bone formation in 
culture. These data illustrate that new classes of antidiabetes drugs 
can be developed by specifically targeting the Cdk5-mediated 
phosphorylation of PPARγ. 

PPARγ is a member of the nuclear receptor family of transcription 
factors and is a dominant regulator of adipose cell differentiation and 
development. It is also the functioning receptor for the 
thiazolidinedione (TZD) class of antidiabetic drugs such as rosiglitazone and 
pioglitazone. These antidiabetes drugs were developed specifically 
to have high affinity and full agonism towards PPARγ before their 
molecular modes of action were known. It has therefore been assumed 
that their therapeutic actions result from their functional agonism on 
this receptor. From a clinical perspective, rosiglitazone (Avandia) and 
pioglitazone (Actos) are both highly effective oral medications for type 
2 diabetes and are well tolerated by the majority of patients. 
Unfortunately, a substantial number of patients experience side effects 
from these drugs, including fluid retention, weight gain, congestive 
heart failure and loss of bone mineral density. Whereas some of 
the non-TZD full agonists have good antidiabetic activity, they also 
cause many of the same side effects, including fluid retention. 

The therapeutic role of classical agonism of PPARγ was made 
somewhat confusing by the development of several compounds that 
have less than full agonist properties (partial agonists) but retain 
substantial insulin-sensitizing and antidiabetic actions in experimental 
models. Furthermore, we have recently shown that many 
antidiabetic PPARγ ligands have a second, distinct biochemical function: 
blocking the obesity-linked phosphorylation of PPARγ by 
cyclin-dependent kinase 5 (Cdk5) at serine 273 (ref. 2). This is a direct action 
of the ligands and requires binding to the PPARγ ligand binding domain 
(LBD), causing a conformational change that interferes with the ability 
of Cdk5 to phosphorylate serine 273. Rosiglitazone and MRL24 (a 
selective partial agonist towards PPARγ) both modulate serine 273 
phosphorylation at therapeutic doses in mice. Furthermore, a small clinical 
trial of newly diagnosed type 2 diabetics showed a remarkably close 
association between the clinical effects of rosiglitazone and the blocking 
of this phosphorylation of PPARγ. Thus, the contribution made by 
classical agonism to the therapeutic effects of these drugs and to their 
side effects is not clear. 

These data indicate that it might be possible to develop entirely new 
classes of antidiabetes drugs optimized for the inhibition of 
Cdk5-mediated phosphorylation of PPARγ while lacking classical agonism. 
Here we describe the development of synthetic small molecules that 
bind tightly to PPARγ, yet are completely devoid of classical agonism 
and effectively inhibit phosphorylation at serine 273. These 
compounds have a unique binding mode in the ligand binding pocket of 
PPARγ. An example from this series, SR1664, shows potent and 
dose-dependent antidiabetic effects in obese mice. Unlike TZDs and other 
PPARγ agonists, this compound does not cause fluid retention or 
weight gain in vivo or reduce osteoblast mineralization in culture. 

To develop a suitable ligand, we optimized compounds for (1) high
binding affinity for PPARγ, (2) blocking the Cdk5-mediated PPARγ
phosphorylation and (3) lacking classical agonism. We first identified
published compounds that bind tightly to PPARγ and have favourable
properties as a scaffold for extensive chemical modifications. Classical
agonism is defined here, as is standard in the nuclear receptor field, as
an increased level of transcription through a tandem PPAR response
element luciferase reporter. Of particular interest was compound 7b,
described previously as an extremely potent and selective PPARγ partial
agonist (30% activation compared to rosiglitazone). A modular synthesis
approach was used to make a series of analogues of compound 7b;
these compounds were tested in vitro and in adipose cells
(Supplementary Fig. 1c, d). Using a LanthaScreen competitive binding assay,
SR1664 (Fig. 1a) had a half-maximum inhibitory concentration (IC50)
of 80 nM (Supplementary Fig. 1a, b). As shown in Fig. 1b, when compared
to rosiglitazone or MRL24 (a partial agonist) in a classical
transcriptional activity assay, SR1664 had essentially no transcriptional
agonism at any concentration. Rosiglitazone and SR1664 both effectively
blocked the Cdk5-mediated phosphorylation of PPARγ in vitro
with half-maximal effects between 20 and 200 nM (Fig. 1c). In contrast,
they had no effect on the phosphorylation of a well-characterized Cdk5
substrate, the Rb protein (Fig. 1d). This indicated that these compounds
do not disrupt the basic protein kinase function of Cdk5. In
addition, SR1664 was effective at blocking Cdk5-mediated
phosphorylation of PPARγ in differentiated fat cells (Fig. 1e) with no
measurable difference in phosphorylation of Rb (Supplementary Fig. 1e).
Additional analogues were synthesized and four compounds were
identified that have similar in vitro profiles (Supplementary Fig. 1b).
SR1824 (Fig. 1a) was further characterized for its ability to block
Cdk5-dependent phosphorylation of PPARγ (Fig. 1b-e). These data
demonstrate that ligands can be made that potently block Cdk5-dependent
phosphorylation of PPARγ in cells while demonstrating little to no
classical agonism.

Figure 1 | Novel PPARγ ligands lack classical agonism and block
phosphorylation at Ser 273. a, Chemical structures of SR1664 and SR1824.
b, Transcriptional activity of a PPAR-derived reporter gene in COS-1 cells
following treatment with rosiglitazone, SR1664 or SR1824 (n = 3). c, d, In vitro
Cdk5 assay with rosiglitazone, SR1664 or SR1824 with PPARγ or Rb substrates.
IB, immunoblot; NT, not treated; pPPARγ, phosphorylated PPARγ; pRb,
phosphorylated Rb. e, TNF-α-induced phosphorylation of PPARγ in
differentiated PPARγ knock-out MEFs expressing wild-type PPARγ treated
with rosiglitazone, SR1664 or SR1824. Error bars are s.e.m.

Of the four compounds identified as non-agonist inhibitors of
Cdk5-mediated PPARγ phosphorylation, SR1664 had adequate
pharmacokinetic properties to move forward to biological and therapeutic
assays. Adipogenesis was the first known biological function of
PPARγ, and agonist ligands for PPARγ have been shown to stimulate
potently the differentiation of pre-adipose cell lines; this response has
been widely used as a sensitive cellular test for PPARγ agonism. As
shown in Fig. 2a, rosiglitazone potently stimulated fat cell
differentiation, as evidenced by Oil Red O staining of the cellular lipid. In
contrast, SR1664 did not stimulate increased lipid accumulation or
changes in morphology characteristic of differentiating fat cells. The
stimulation of fat cell gene expression was also apparent with
rosiglitazone, as illustrated by an increased expression of genes linked to
adipogenesis. In contrast, SR1664 induced little or no change in the
expression of these genes (Fig. 2b).

Another well-known effect of both rosiglitazone and pioglitazone is
that they decrease bone formation and bone mineral density, leading to
an increase in fracture risk. TZDs have also been shown to decrease
bone mineralization in cultured osteoblasts. As shown in Fig. 2c,
rosiglitazone treatment reduced the mineralization of mouse osteoblastic
cells, as measured by Alizarin red staining. Moreover, the expression of
genes involved in the differentiation of these cells was impaired (see
Supplementary Fig. 2). Importantly, treatment with SR1664 did not affect
the extent of calcification or the expression of this osteoblast gene set in
MC3T3-E1 cells.

Co-crystallography, mutagenesis and hydrogen/deuterium exchange
(HDX) have all demonstrated that full agonists of PPARγ affect critical
hydrogen bonds within the C-terminal helix (H12) of the receptor.
This interaction stabilized the AF2 surface (helix 3-4 loop, C-terminal
end of H11 and H12) of the receptor, facilitating co-activator
interactions. Interestingly, high affinity partial agonists have been identified
that do not make these interactions yet still possess some level of
classical agonism, and several of these have been shown to bind the
backbone amide of S342 (S370 in PPARγ2) within the β-sheet of the
LBD. More recently, we demonstrated that the proximity of ligand to
the amide of S342 correlated with increased stability of the helix 2-helix
2′ loop, the region of the receptor containing S273 (S245 in PPARγ1) as
determined by HDX. Surprisingly, HDX analysis showed that SR1664 and
SR1824 increased the conformational mobility of the C-terminal end
of H11, a helix that abuts H12 (Fig. 2d); in contrast, the full and partial
agonists stabilized the same region of H11 (Supplementary Fig. 3).

Figure 2 | Structural and in vitro functional analysis of SR1664. a, Lipid
accumulation in differentiated 3T3-L1 cells treated with rosiglitazone or
SR1664 following Oil Red O staining. b, Expression of adipocyte-enriched
genes in these cells was analysed by qPCR (n = 3). c, Mineralization of
MC3T3-E1 osteoblast cells as determined by Alizarin Red-S. Error bars are
s.e.m.; *P < 0.05, **P < 0.01, ***P < 0.001. NT, no treatment. d, Overlay of
differential HDX data onto the docking model of 2hfp bound to SR1664 (see
Supplemental Fig. 3). This overlay depicts the difference in HDX between
ligand-free and SR1664 bound PPARγ LBD. Perturbation data are colour
coded and plotted onto the backbone of the PDB file according to the key. n.s.,
not significant. Observed changes in HDX were statistically significant
(P < 0.05) in a two-tailed t-test (n = 3).

In silico docking studies were carried out to understand the structural
basis of SR1664 interactions in the PPARγ1 ligand binding domain
(Supplementary Fig. 4). In this model, the phenyl-substituted nitro
group of SR1664 clashes with hydrophobic side chains of H11 such
as Leu 452 and Leu 453 (Leu 480 and Leu 481 in PPARγ2, respectively)
as well as Leu 469 and Leu 465 (corresponding to Leu 497 and Leu 493
in PPARγ2) of the loop N-terminal to H12. This potentially explains
the lack of stabilization of H12 and the destabilization of the region
of H11 near His 449 as seen by HDX. Despite the altered mode of
binding, SR1664 and rosiglitazone both bind to the same core residues
within the PPARγ LBD. This is demonstrated by the ability of SR1664 to
attenuate the transcriptional activity of rosiglitazone on PPARγ in the
context of a competitive ligand binding assay (Supplementary Fig. 4b).

To determine whether the altered transcriptional activity of SR1664
may be attributed to differences in DNA binding or coactivator
recruitment, we compared the chromatin association of PPARγ or steroid
receptor co-activator-1 (SRC1) within the aP2 promoter. As expected,
rosiglitazone significantly increased SRC1 occupancy without affecting
PPARγ occupancy. However, SR1664 treatment did not influence the
occupancy of PPARγ or SRC1 recruitment to the aP2 promoter,
indicating that SR1664 has a very different pattern of co-regulator
recruitment (Supplementary Fig. 4c).

We next asked whether SR1664 had antidiabetic properties in vivo.
Wild-type mice fed a high-fat high-sugar diet become obese and
insulin-resistant, with activation of Cdk5 in their adipose tissues. Figure 3a
demonstrates that SR1664, injected twice daily for 5 days, caused a
dose-dependent decrease in the Cdk5-mediated phosphorylation of PPARγ
at serine 273 in adipose tissue. Moreover, SR1664 treatment also caused
a trend towards lowered (and normalized) glucose levels, and a
significant reduction in the fasting insulin levels. Insulin resistance, as
computed by HOMA-IR, showed a clear and dose-dependent improvement
with SR1664 (Fig. 3b). These changes occurred without significant
differences in body weight compared to vehicle-treated mice
(Supplementary Fig. 5).

Figure 3 | Antidiabetic activity of SR1664 in high-fat diet (HFD) mice.
a, Dose-dependent inhibition of phosphorylation of PPARγ by SR1664 in white
adipose tissue (WAT). Quantification of PPARγ phosphorylation compared to
total PPARγ (right). b, Ad libitum-fed glucose (P = 0.062 at 10 mg kg−1),
insulin and HOMA-IR in HFD mice. c, Glucose infusion rate (GIR),
suppression of hepatic glucose production (HGP), whole body glucose disposal
and WAT 2-deoxyglucose tracer uptake during hyperinsulinaemic-euglycaemic
clamps. d, Expression of a gene set regulated by PPARγ
phosphorylation in WAT. e, Expression of an agonist gene set (see Methods) in
WAT. Error bars are s.e.m.; *P < 0.05, **P < 0.01.
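
The HOMA-IR index referred to above is a simple function of paired glucose and insulin measurements. The text does not state which variant or units were used, so the following is only a minimal sketch assuming the widely used form, HOMA-IR = glucose (mmol/l) × insulin (μU/ml) / 22.5. The function name and the sample values are illustrative and are not taken from the study.

```python
def homa_ir(glucose_mmol_per_l: float, insulin_uU_per_ml: float) -> float:
    """Homeostatic model assessment of insulin resistance.

    Assumes the common formula: glucose (mmol/l) x insulin (uU/ml) / 22.5.
    The constant and the units are an assumption, not stated in the text.
    """
    if glucose_mmol_per_l < 0 or insulin_uU_per_ml < 0:
        raise ValueError("concentrations must be non-negative")
    return glucose_mmol_per_l * insulin_uU_per_ml / 22.5


# Hypothetical paired measurements for two animals (not data from the paper).
vehicle = homa_ir(glucose_mmol_per_l=9.0, insulin_uU_per_ml=50.0)
treated = homa_ir(glucose_mmol_per_l=8.0, insulin_uU_per_ml=20.0)
print(f"vehicle: {vehicle:.1f}, treated: {treated:.1f}")
```

Because the index is a product, a fall in insulin with unchanged glucose lowers HOMA-IR proportionally, which is consistent with the pattern described in the text.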

The most accurate method for measuring changes in insulin
sensitivity in vivo is the hyperinsulinaemic-euglycaemic clamp. As shown
in Fig. 3c and in Supplementary Fig. 6, the glucose infusion rate (GIR)
needed to maintain euglycaemia in the mice treated with SR1664 was
significantly greater than in animals treated with the vehicle alone,
indicating improved whole-body insulin sensitivity. Suppression of
hepatic glucose production (HGP), an important component of
insulin action, was improved by SR1664. Whereas no difference in
whole-body glucose disposal was detected from calculations of
3H-glucose turnover, analysis of tissue-specific 14C-2-deoxyglucose
transport demonstrated improved insulin-stimulated glucose disposal
in adipose tissue of SR1664-treated mice. Similarly, reductions in both
basal and clamped plasma free fatty acid levels, as well as a 20%
greater suppression of lipolysis in response to insulin, indicated
improved adipose tissue insulin sensitivity in SR1664-treated mice.
Together, these data indicate that SR1664 improves insulin sensitivity.

Using cells expressing the S273A mutant of PPARγ, we previously
defined a gene set in cultured adipose cells that was most sensitive to
the phosphorylation at this site. Treatment of mice with SR1664
caused changes in the expression of 11/17 (65%) of these genes, all
in the direction predicted for the inhibition of the PPARγ S273
phosphorylation (Fig. 3d). Adiponectin and adipsin, genes long recognized
as being reduced in obesity, are both induced by SR1664. We also
defined a separate set of genes reflective of a full agonist (rosiglitazone)
on cultured fat cells. SR1664 caused changes in expression of 6/19
genes in this ‘agonist’ gene set; importantly, three of these changes
were in the same direction as expected for an agonist, but three were
changed in the opposite direction (Fig. 3e). Taken together, these data
show that SR1664 has an insulin-sensitizing effect with preferential
regulation of the gene set sensitive to the phosphorylation of PPARγ by
Cdk5.

A more severe model of obesity is the leptin-deficient ob/ob mouse.
These animals are very obese and insulin-resistant, with substantial
compensatory hyperinsulinaemia. Preliminary pharmacokinetic and
pharmacodynamic experiments showed comparable drug exposures
at 40 mg kg−1 for SR1664 and 8 mg kg−1 for rosiglitazone, both
injected twice daily (Supplementary Fig. 7). Functional analyses were
performed at days 5 and 11 after the start of treatments. As shown in
Fig. 4a, both drugs caused a similar reduction in PPARγ
phosphorylation at S273. After 5 days of treatment, there were no overt differences
in fasting body weight or glucose levels (Fig. 4b). Control mice
receiving only the vehicle remained hyperinsulinaemic, but both
rosiglitazone and SR1664 substantially reduced these insulin levels (Fig. 4b).
Glucose tolerance tests were markedly improved with both
rosiglitazone and SR1664, and the areas under these glucose excursion curves
were statistically indistinguishable, without changing body weight
(Fig. 4c).
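
The area under a glucose excursion curve is commonly obtained with the trapezoidal rule. The text does not say how the areas were calculated, so the sketch below simply assumes that approach; the function name, time points and glucose values are hypothetical.

```python
from typing import Sequence


def glucose_auc(times_min: Sequence[float], glucose: Sequence[float]) -> float:
    """Area under a glucose-versus-time curve by the trapezoidal rule.

    times_min must be strictly increasing; the result has units of
    (glucose units) x minutes.
    """
    if len(times_min) != len(glucose) or len(times_min) < 2:
        raise ValueError("need at least two paired time/glucose points")
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        # Area of one trapezoid: interval width times mean of the two heights.
        area += dt * (glucose[i] + glucose[i - 1]) / 2.0
    return area


# Hypothetical tolerance-test readings (mg/dl) at 0, 15, 30, 60 and 120 min.
times = [0, 15, 30, 60, 120]
readings = [120, 380, 420, 300, 160]
print(glucose_auc(times, readings))
```

Comparing such areas between treatment groups is one simple way to summarize a whole excursion curve as a single number per animal.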

Weight gain and fluid retention caused by TZD drugs like
rosiglitazone are suspected to be key factors in their increased cardiac risk.
After recovering from the glucose tolerance test on day 5,
rosiglitazone-treated mice began to show an increase in body weight (Fig. 4d). This
increased mass is accounted for primarily by fluid retention, quantified
by a decrease in haematocrit seen with haemodilution (Fig. 4f).
However, an increase in body fat was also observed by magnetic
resonance imaging (Fig. 4e, f). Importantly, SR1664 treatment did not cause
the weight gain seen with the rosiglitazone treatment. Furthermore,
SR1664 treatment showed no decrease in the haematocrit or change
in body adiposity. These results were confirmed by measurements
showing a decreased concentration of haemoglobin in the mice treated
with rosiglitazone, but not in those treated with SR1664
(Supplementary Fig. 8). Taken together, these data indicate that SR1664, a
non-agonist PPARγ ligand, has antidiabetic actions in two murine models
of insulin-resistance. Furthermore, this non-agonist does not stimulate
two of the best documented side-effects of the PPARγ agonist drugs
in vivo.

Figure 4 | SR1664 has potent antidiabetic activity and does not promote
fluid retention in ob/ob mice. a, Phosphorylation of PPARγ in WAT (left).
Quantification of PPARγ phosphorylation compared to total PPARγ (right).
b, c, Fasting body weight, blood glucose and insulin levels before
glucose-tolerance tests (GTT) in ob/ob mice treated with vehicle, rosiglitazone or
SR1664 (n = 8). Whole-body weight (d) and fat change (e) with continued drug
administration following the GTT. f, Packed cell volume (PCV) in whole blood
from ob/ob mice treated with vehicle, rosiglitazone or SR1664. Error bars are
s.e.m.; *P < 0.05, **P < 0.01, ***P < 0.001. n.s., not significant.

The TZD class of drugs has been important for the treatment of type
2 diabetes. Whereas these drugs function as full agonists for PPARγ,
the role of agonism in their therapeutic effects has been called into
question recently. Rosiglitazone and partial agonists like MRL24 both
block the obesity-linked phosphorylation of PPARγ at serine 273
(ref. 2). The tight correlation between inhibition of this
phosphorylation and the therapeutic effects of these drugs in both mouse and man
suggested that it might be possible to create new classes of non-agonist
ligands for PPARγ which are effective for the treatment of diabetes
and cause fewer side effects. Hence, this paper addresses three key
questions: first, is it possible to create novel PPARγ ligands that
block Cdk5-mediated PPARγ phosphorylation yet have no classical
agonism? Second, would such compounds have robust antidiabetic
activity? Finally, would non-agonist compounds have fewer side effects
than classical full agonists like rosiglitazone?

We show here that it is possible to create new ligands that have high
affinity for PPARγ, block the Cdk5-mediated phosphorylation and
completely lack classical agonism. SR1664 does not function as an
agonist and has no adipogenic action in vitro. The structural
requirements for the non-agonist actions of SR1664 and SR1824 are
particularly interesting. Ligands that function as classical full agonists, like
rosiglitazone, have been shown to alter the conformation and HDX
kinetics of H12, the major agonist helix. Surprisingly, ligands that do
not affect the conformational dynamics of H12 are not non-agonists;
rather, they seem to function as partial agonists. This strongly
suggests that when engaged by ligands, other structural features of
the AF2 surface such as H3, H3-H4 loops and the C-terminal end of
H11 contribute to partial agonism of the receptor. As expected, SR1664
and SR1824 do not interact with H12 in any detectable way, but
unexpectedly both ligands cause an increase in the conformational
mobility of H11, which is part of the AF2 surface and directly abuts
H12. Hence, it seems likely that the destabilization of H11 distorts the
AF2 surface enough to block partial agonism. Whether there are other
alternative modes of ligand binding that would lead to a complete lack
of classical agonism remains to be determined.

That classical agonism is not required for strong antidiabetic actions 
of a PPARγ ligand is now clear. In both diet-induced and genetically
obese animals, SR1664 has strong antidiabetic actions. The ability to 
improve adipose tissue insulin sensitivity is similar to the effects shown 
for rosiglitazone. SR1664 has inferior pharmacokinetic properties 
compared to rosiglitazone, so an absolute quantitative comparison of 
their efficacy is difficult. However, using our best calculations to get 
approximately equal exposure to the two drugs in vivo, SR1664 has 
very robust antidiabetic activity, roughly equivalent to rosiglitazone in 
the experiments shown here. The unfavourable pharmacokinetic 
properties of SR1664 strongly suggest that this compound will never 
be administered to patients but it proves that non-agonist compounds 
can have robust therapeutic effects. 

Analysis of the side effects of PPARγ ligands can be difficult because
some of these (like cardiovascular disorders) do not occur in mice 
whereas others (like loss of bone mineral density) take many months 
of treatment to manifest. However, weight gain and fluid retention 
occur rapidly in both humans and mice. Increased body weight, 
increased accretion of fat tissues and increased fluid retention all occur 
in mice within 11 days of treatment with rosiglitazone (Fig. 4). The 
non-agonist SR1664 shows none of these side effects, even as it effec- 
tively improves glucose homeostasis. Unlike rosiglitazone, SR1664 
does not affect bone cell mineralization in culture (Fig. 2c). Taken 
together, these data indicate that many of the known side effects of 
the TZD drugs occur as a consequence of classical agonism on target 
genes. Whether ligands directed at the Cdk5-mediated phosphoryla- 
tion have their own problems remains to be determined. Still, these 
studies illustrate that the development of entirely new classes of 
PPARγ-targeted drugs is feasible.


METHODS SUMMARY 

Cell culture. Adipocyte differentiation in 3T3-L1 or PPARγ-null mouse embryonic
fibroblasts (MEFs) expressing PPARγ was induced by treating cells with 1 μM
dexamethasone, 0.5 mM isobutylmethylxanthine and 850 nM insulin for 48 h, after
which cells were switched to the maintenance medium containing 850 nM insulin
for 6 days.

Gene expression analysis. Total RNA was isolated from cells or tissues using
TRIzol reagent (Invitrogen). The RNA was reverse-transcribed using the ABI
reverse transcription kit. Quantitative PCR (qPCR) reactions were performed with
SYBR green fluorescent dye using an ABI9300 PCR machine. Relative mRNA
expression was determined by the ΔΔCt method normalized to TATA-binding
protein (TBP) levels.

Animals. All animal experiments were performed according to procedures
approved by Beth Israel Deaconess Medical Center’s Institutional Animal Care
and Use Committee. Male C57BL/6J and C57BL/6J-Lepob/ob mice (4- to 5-week-old)
were obtained from the Jackson Laboratory. C57BL/6J mice were fed a high-fat,
high-sucrose diet (60% kcal fat, D12492, Research Diets Inc.). For glucose
tolerance tests, mice were injected intraperitoneally (i.p.) with rosiglitazone or
SR1664 for 5 days, and fasted overnight before i.p. injection of 1 g kg−1 D-glucose.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

SR1664. (S)-4'-((5-((1-(4-nitrophenyl)ethyl)carbamoyl)-1H-indol-1-yl)methyl)-[1,1'-biphenyl]-2-carboxylic
acid. Commercially available ethyl 2,3-dimethyl-1H-indole-5-carboxylate was
N-alkylated with commercially available tert-butyl
4'-(bromomethyl)biphenyl-2-carboxylate using NaH in DMF. The corresponding
ethyl ester was hydrolysed using aqueous NaOH in ethanol to give the acid, which
was coupled to (S)-1-(4-nitrophenyl)ethanamine using
2-(3H-[1,2,3]triazolo[4,5-b]pyridin-3-yl)-1,1,3,3-tetramethylisouronium
hexafluorophosphate(V) (HATU) and diisopropylethylamine in CH2Cl2 to give the
amide. Final deprotection of the tert-butyl ester using 30% trifluoroacetic acid in
CH2Cl2 and purification by flash chromatography (ethyl acetate/hexanes 10-100%)
afforded SR1664. Electrospray ionisation coupled with mass spectrometry
(ESI-MS; m/z): 576 [M+H]+; 1H NMR (400 MHz, dimethylsulphoxide (DMSO)-d6):
δ (p.p.m.) 8.83 (d, J = 7.6 Hz, 1H),
8.25 (m, 1H), 8.16 (d, J = 1.2 Hz, 1H), 7.74-7.68 (m, 4H), 7.57 (dt, J = 1.6, 7.2 Hz,
1H), 7.51 (d, J = 8.4 Hz, 1H), 7.46 (dt, J = 1.2, 7.2 Hz, 1H), 7.36 (dd, J = 0.8, 7.6 Hz,
1H), 7.28 (m, 2H), 7.03 (m, 2H), 5.52 (s, 2H), 5.32 (quint, J = 7.2 Hz, 1H), 2.36 (s,
3H), 2.34 (s, 3H), 1.57 (d, J = 6.8 Hz, 3H); 13C NMR (400 MHz, DMSO-d6): δ
(p.p.m.) 170.5, 167.9, 154.5, 147.2, 141.5, 140.7, 138.7, 138.2, 135.1, 133.2, 131.8,
131.5, 130.0, 129.6, 128.6, 128.2, 128.1, 126.8, 125.8, 124.4, 121.4, 118.8, 109.7, 108.3,
49.4, 46.7, 22.9, 11.0, 9.7.

SR1824. (S)-4'-((5-(1-(4-bromophenyl)ethylcarbamoyl)-2,3-dimethyl-1H-indol-1-yl)methyl)biphenyl-2-carboxylic
acid (1824) was synthesized in the same manner
using (S)-1-(4-bromophenyl)ethanamine. ESI-MS (m/z): 581/583 [M+H]+;
1H NMR (400 MHz, DMSO-d6): δ (p.p.m.) 1.48 (d, J = 6.8 Hz, 3H, CH3
(4-bromophenyl)ethylcarbamoyl), 2.28 (s, 3H, CH3 indole), 2.32 (s, 3H, CH3
indole), 5.17 (quintuplet, J = 7.6 Hz, 1H, CH (4-bromophenyl)ethylcarbamoyl),
5.47 (s, 2H, CH2-biphenyl), 6.99 (d, J = 8 Hz, 2H, H, and Hg biphenyl), 7.24
(d, J = 8 Hz, 2H, Hg and Hypo biphenyl), 7.31 (d, J = 7.6 Hz, 1H, H; indole),
7.36-7.55 (m, 7H, H2, H3 and Hy, biphenyl, H, indole and H 4-bromophenyl), 8.10
(d, J = 1.6 Hz, 1H, Hy indole), 8.65 (d, J = 8 Hz, 1H, NH amide). 13C NMR
(400 MHz, DMSO-d6): δ (p.p.m.) 169.5, 166.7, 144.9, 140.5, 139.7, 137.6, 137.3,
134.0, 132.2, 131.0, 130.8, 130.4, 129.0, 128.6, 128.4, 127.6, 127.2, 125.9, 125.0,
120.3, 119.4, 117.7, 108.7, 107.3, 47.9, 45.7, 22.1, 10.1, 8.6.

Cell culture. COS-1, 3T3-L1 and HEK-293 cells were obtained from ATCC.
Adipocyte differentiation in 3T3-L1 or PPARγ-null mouse embryonic fibroblasts
(MEFs) expressing PPARγ was induced by treating cells with 1 μM dexamethasone,
0.5 mM isobutylmethylxanthine and 850 nM insulin with 10% FBS in DMEM for
48 h, after which cells were switched to the maintenance medium containing 10%
FBS and 850 nM insulin. Lipid accumulation in the cells was detected by Oil Red O
staining. All chemicals for cell culture were obtained from Sigma unless otherwise
indicated.

In vitro kinase assay. Active Cdk5/p35 was purchased from Millipore. The in vitro
CDK kinase assay was performed according to the manufacturer’s instructions
(Cell Signaling Technology). Purified PPARγ (0.5 μg; Cayman Chemicals) was
incubated with active CDK kinase in assay buffer (25 mM Tris-HCl pH 7.5, 5 mM
beta-glycerophosphate, 2 mM dithiothreitol (DTT), 0.1 mM Na3VO4, 10 mM
MgCl2) containing 20 μM ATP for 15 min at 30 °C. PPARγ ligands were
pre-incubated with the specified substrates for 30 min before the assay was performed.
Rb (Cell Signaling Technology) was used as a positive control.

LanthaScreen. The PPARγ competitive binding assay (Invitrogen) was performed
according to the manufacturer’s protocol. A mixture of 5 nM glutathione
S-transferase fused with the PPARγ ligand binding domain (GST-PPARγ-LBD), 5 nM
Tb-GST-antibody, 5 nM Fluormone Pan-PPAR Green and serial dilutions of SR1664
from 10 μM downwards was added to wells of black 384-well low-volume plates
(Greiner) to a total volume of 18 μl. All dilutions were made in TR-FRET assay
buffer C. DMSO at 2% final concentration was used as a no-ligand control.
Experiments were performed in triplicate and incubated for 2 h in the dark before
analysis in a Perkin Elmer ViewLux ultra HTS microplate reader. The FRET signal
was measured by excitation at 340 nm and emission at 520 nm for fluorescein and
490 nm for terbium. The fold change over DMSO was calculated from the
520 nm/490 nm ratio. Graphs were plotted as fold change of FRET signal for each
compound over DMSO-only control.
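
The readout described above reduces to two steps: form the 520 nm/490 nm emission ratio for each well, then divide by the mean ratio of the DMSO-only wells. A minimal sketch of that arithmetic follows; the function names and the well values are hypothetical, and averaging the replicate wells before taking the fold change is an assumption rather than something the protocol states.

```python
from statistics import mean
from typing import List, Tuple

Well = Tuple[float, float]  # (emission at 520 nm, emission at 490 nm)


def emission_ratio(em_520: float, em_490: float) -> float:
    """TR-FRET ratio: acceptor (520 nm) over donor (490 nm) emission."""
    if em_490 <= 0:
        raise ValueError("490 nm emission must be positive")
    return em_520 / em_490


def fold_change_over_dmso(sample_wells: List[Well], dmso_wells: List[Well]) -> float:
    """Mean sample ratio divided by mean DMSO-only ratio."""
    sample = mean([emission_ratio(a, d) for a, d in sample_wells])
    control = mean([emission_ratio(a, d) for a, d in dmso_wells])
    return sample / control


# Hypothetical triplicate wells: a competing ligand displaces the tracer,
# so the 520/490 ratio falls relative to the DMSO-only control.
dmso = [(4000.0, 1000.0), (4100.0, 1000.0), (3900.0, 1000.0)]
ligand = [(1000.0, 1000.0), (1100.0, 1000.0), (900.0, 1000.0)]
print(fold_change_over_dmso(ligand, dmso))
```

A value near 1 would indicate no displacement of the fluorescent tracer, and values well below 1 indicate competition for the ligand binding domain.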

Cell-based transactivation assay. COS-1 cells were cotransfected in batch by
adding 4.5 μg full-length murine PPARγ2-pSV Sport or full-length human
PPARγ2-pSport6, with 4.5 μg 3× multimerized PPRE-luciferase reporter and
27 μl X-treme Gene 9 transfection reagent in serum-free Opti-MEM reduced
serum media (Gibco). After 18-h incubation at 37 °C in a 5% CO2 incubator,
transfected cells were plated in triplicate in white 384-well plates (Perkin Elmer)
at a density of 10,000 cells per well. After replating, cells were treated with either
DMSO vehicle only or the indicated compounds in increasing doses from 2 pM to
10 μM for the mouse receptor or 220 pM to 2 μM for the human receptor. After 18-h
incubation, treated cells were developed with Brite Lite Plus (Perkin Elmer) and
read in a 384-well Luminescence Perkin Elmer EnVision Multilabel plate reader.
Graphs were plotted in triplicate as fold change of treated cells over DMSO-treated
control cells.


Ensemble docking. PPARγ co-crystal structures (68 in total) with unique ligands
were identified in the Protein Data Bank (PDB) (as of 3 January 2011). Four
structures were selected based on the maximum similarity of the co-crystal ligands
to SR1664; specifically 3kmg (ligand 538, 0.98 similarity), 2hfp (ligand NSI,
similarity of 0.91), 1fm9 (ligand 570, 0.90 similarity), 2pob (ligand GW4, 0.88
similarity). SR1664 was prepared using Schrodinger LigPrep generating tautomers
and ionization states (pH range 7 ± 2). Flexible ligand docking of SR1664 against
the four structures was performed using Schrodinger Glide. At least one of the two
constraints Arg 288 and Ser 342 (Arg 316 and Ser 370 in PPARγ2) was required to
score docking poses. The best docking score (Glide docking scores are meant to
correspond to binding affinity) of −9.21 was achieved with the PPARγ structure
2hfp, in which SR1664 forms a hydrogen bond to Ser 342 (shown in Fig. 2).
Unconstrained docking produced almost the same docking pose with the
preserved hydrogen bonding to Ser 342 and a slightly less favourable docking score of
−8.99, indicating Ser 342 as a critical ligand binding element.

Differentiation of MC3T3-E1. After reaching confluence, cells were grown in
α-MEM supplemented with 10% FBS, 1% penicillin-streptomycin, 200 μM ascorbic
acid and 10 mM β-glycerophosphate. The cells were treated with either
rosiglitazone (10 μM) or SR1664 (10 μM) or left in vehicle at the start of differentiation. The
cells were collected 7 days post-differentiation for gene expression analysis and
21 days post-differentiation for mineralization. The mineralization of MC3T3-E1
cells was determined by Alizarin red S staining (Millipore catalogue no. ECM815) as
per manufacturer’s instructions.

Preparation of cell or tissue lysates and immunoblotting. Differentiated
adipocytes were pre-treated with PPARγ ligands for 45 min, and incubated with
TNF-α for 30 min. For tissue lysates, WAT from mice was homogenized in RIPA
buffer (50 mM Tris pH 7.5, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate,
0.1% SDS with protease and phosphatase inhibitors). For western blotting, a
phospho-specific antibody against PPARγ Ser 273 was used. Total tissue lysates
were analysed with an anti-PPARγ antibody (Santa Cruz).

Gene expression analysis. Total RNA was isolated from cells or tissues using
TRIzol reagent (Invitrogen). The RNA was reverse-transcribed using the ABI
reverse transcription kit. Quantitative PCR reactions were performed with
SYBR green fluorescent dye using an ABI9300 PCR machine. Relative mRNA
expression was determined by the ΔΔCt method normalized to TATA-binding
protein (TBP) levels. The sequences of primers used in this study are found in
Supplementary Table 1.
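
The ΔΔCt calculation named above can be written in a few lines. The sketch below assumes the standard comparative form, in which each target Ct is first normalized to the reference gene (TBP here) and the treated sample is then compared with the control, with an amplification efficiency of roughly 100% (a doubling per cycle). The function name and the Ct values are illustrative only.

```python
def relative_expression(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Fold change of a target gene by the comparative (delta-delta Ct) method.

    Assumes about 100% PCR efficiency, so one cycle equals a twofold change.
    """
    # Normalize each sample to its reference gene (for example TBP).
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare treated with control, then convert cycles to fold change.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in treated cells while the reference gene is unchanged.
print(relative_expression(22.0, 20.0, 24.0, 20.0))
```

A lower Ct means more starting template, which is why the exponent carries a minus sign.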

ChIP. Differentiated 3T3-L1 adipocytes were treated on day 6 with 1 μM of
compounds or vehicle for 24 h. The samples were prepared using the manufacturer’s
protocol (ChampionChIP One-Day Kit, Qiagen). Briefly, cross-linked chromatin
was sonicated and 5 μg of antibody was used to immunoprecipitate the pre-cleared
samples. The following antibodies were used: normal rabbit IgG, PPARγ (Santa
Cruz), SRC-1 (Abcam). The promoter region of aP2 for PPARγ binding was
amplified using PCR with reverse transcription (RT-PCR). The primers used
for aP2 were aP2 forward 5′-AAATTCAGAAGAAAGTAAACACATTATT-3′;
aP2 reverse 5′-ATGCCCTGACCATGTGA-3′.

Gene sets from microarray. We performed a microarray with total RNA isolated
from PPARγ-null fibroblasts expressing wild-type or S273A mutant of PPARγ or
WT cells treated with 1 μM rosiglitazone for 24 h (ref. 2). To create refined gene
sets regulated by phosphorylation of PPARγ or rosiglitazone, we first calculated
P-values as well as fold-change of gene expression in wild-type versus S273A
mutant cells or wild-type versus wild-type/rosiglitazone cells, and we plotted
−log P-value versus log2 fold-change. From this list of genes, we selected genes
which were changed in magnitude (≥1.4-fold difference) and statistical
significance (P < 0.05). The selected genes were validated in cells by using qPCR, and the
resulting gene sets (phosphorylation-dependent or agonist-dependent gene sets)
were analysed in WAT of mice using qPCR.
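
The selection step above amounts to a two-condition filter on each gene. The sketch below applies the stated cut-offs (at least 1.4-fold change and P < 0.05); treating the fold-change threshold as symmetric, so that both up- and down-regulated genes qualify, is an assumption, and the gene names and values are placeholders rather than data from the study.

```python
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    name: str
    fold_change: float  # expression ratio between conditions, must be > 0
    p_value: float


def select_genes(
    results: List[GeneResult],
    min_fold: float = 1.4,
    max_p: float = 0.05,
) -> List[str]:
    """Names of genes passing both the magnitude and significance cut-offs.

    A gene passes the magnitude test if it is changed at least min_fold in
    either direction (assumed symmetric: ratio >= 1.4 or <= 1/1.4).
    """
    selected = []
    for gene in results:
        changed = gene.fold_change >= min_fold or gene.fold_change <= 1.0 / min_fold
        if changed and gene.p_value < max_p:
            selected.append(gene.name)
    return selected


# Placeholder results for four hypothetical genes.
example = [
    GeneResult("geneA", 2.0, 0.01),   # up, significant: kept
    GeneResult("geneB", 1.2, 0.001),  # significant but too small: dropped
    GeneResult("geneC", 0.5, 0.20),   # large but not significant: dropped
    GeneResult("geneD", 0.5, 0.03),   # down, significant: kept
]
print(select_genes(example))
```

Requiring both conditions corresponds to taking the outer upper corners of the −log P versus log2 fold-change plot described in the text.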

Animals. All animal experiments were performed according to procedures 
approved by Beth Israel Deaconess Medical Center’s Institutional Animal Care 
and Use Committee. Male C57BL/6] or C57BL/6]-Lep?”” mice (4- to 5-week-old) 
were obtained from the Jackson Laboratory. C57BL/6] mice were fed a regular diet 
(10% kcal fat, D12450B, Research Diets Inc.) or a high-fat, high-sugar diet (60% 
kcal fat, D12492, Research Diets Inc.) for either 8, 10 or 18 weeks. The mice were 
intraperitoneally (i-p.) injected twice daily with 4 mgkg’ | rosiglitazone or 20 mg 
kg ' SR1664 for 6 days before gene expression analysis or hyperinsulinaemic- 
euglycaemic clamp experiments. Clamps were performed essentially as previously 
described with one exception to the standard protocol”. As the mice were fed a 
high-fat diet for 8 weeks before the clamp studies, a higher insulin infusion rate of 
4mU (kg-min) “7 was used instead of the typical 3 mU (kg-min)~? for standard 
chow studies. For glucose tolerance tests, 6-week-old male C57BL/ 6J-Lep™ ob mice 
were i.p. injected twice daily with 8 mgkg * rosiglitazone or 40 mgkg * SR1664 
for 6 days, and fasted overnight before i.p. injection of 1 gkg_' D-glucose. Glucose 
was measured by tail vein bleeds at the indicated intervals using a Truetrack
glucometer. Serum insulin concentrations were determined by ELISA (Crystal
Chem).

29. Kim, H. J. et al. Differential effects of interleukin-6 and -10 on skeletal muscle and
liver insulin action in vivo. Diabetes 53, 1060-1067 (2004).

Absence of effects of Sir2 overexpression on lifespan
in C. elegans and Drosophila


Camilla Burnett, Sara Valentini, Filipe Cabreiro, Martin Goss, Milan Somogyvari, Matthew D. Piper, Matthew Hoddinott,
George L. Sutphin, Vid Leko, Joshua J. McElwee, Rafael P. Vazquez-Manrique, Anne-Marie Orfila, Daniel Ackerman,
Catherine Au, Giovanna Vinti, Michèle Riesen, Ken Howard, Christian Neri, Antonio Bedalov, Matt Kaeberlein,
Csaba Soti, Linda Partridge & David Gems


Overexpression of sirtuins (NAD⁺-dependent protein deacetylases) has
been reported to increase lifespan in budding yeast (Saccharomyces 
cerevisiae), Caenorhabditis elegans and Drosophila melanogaster’. 
Studies of the effects of genes on ageing are vulnerable to confound- 
ing effects of genetic background*. Here we re-examined the reported 
effects of sirtuin overexpression on ageing and found that standard- 
ization of genetic background and the use of appropriate controls 
abolished the apparent effects in both C. elegans and Drosophila. In 
C. elegans, outcrossing of a line with high-level sir-2.1 overexpression’
abrogated the longevity increase, but did not abrogate sir-2.1
overexpression. Instead, longevity co-segregated with a second-site 
mutation affecting sensory neurons. Outcrossing of a line with low- 
copy-number sir-2.1 overexpression’ also abrogated longevity. A 
Drosophila strain with ubiquitous overexpression of dSir2 using 
the UAS-GAL4 system was long-lived relative to wild-type controls,
as previously reported’, but was not long-lived relative to the appro- 
priate transgenic controls, and nor was a new line with stronger 
overexpression of dSir2. These findings underscore the importance 
of controlling for genetic background and for the mutagenic effects 
of transgene insertions in studies of genetic effects on lifespan. The 
life-extending effect of dietary restriction on ageing in Drosophila 
has also been reported to be dSir2 dependent*. We found that dietary 
restriction increased fly lifespan independently of dSir2. Our find- 
ings do not rule out a role for sirtuins in determination of metazoan 
lifespan, but they do cast doubt on the robustness of the previously 
reported effects of sirtuins on lifespan in C. elegans and Drosophila. 

The role of sirtuins in ageing was discovered in budding yeast, where 
overexpression of SIR2 increases replicative lifespan’. It was then 
reported that elevated sirtuin levels increase lifespan in the nematode 
C. elegans'?* and the fruitfly Drosophila’, indicating an evolutionarily 
ancient role of sirtuins in longevity assurance’. Dietary restriction 
(reduced food intake short of starvation) extends lifespan in organisms 
ranging from yeast to mammals’, and initial studies indicated that 
dietary restriction increases lifespan by activating sirtuins in yeast”, 
C. elegans’ and Drosophila’. Pharmacological activation of sirtuins 
has therefore been widely promulgated as a potential means to mimic 
dietary restriction and slow ageing in humans''. However, several 
aspects of the role of sirtuins in ageing have proved controversial’’. 
Subsequent studies have indicated that sirtuins do not mediate the 
effects of dietary restriction on ageing, at least in budding yeast and 
C. elegans'*"*. The plant-derived polyphenol resveratrol and other 
compounds have been reported to activate sirtuins and extend life- 
span’>'®, but more recent findings have challenged both effects’”°. 
We therefore re-examined the effects of sirtuin overexpression on 
lifespan in C. elegans and Drosophila. In particular, we wished to
exclude the possibility that the increased longevity observed in strains
with overexpression of sirtuin genes is caused by differences in genetic 
background, or by the mutagenic effects of transgene insertion, which 
frequently confound studies of the genetics of ageing*. 

We first examined a high-copy-number sir-2.1 transgenic C. elegans
strain (LG100) carrying the integrated transgene array geIn3 [sir-2.1
rol-6(su1006)] (ref. 1). As expected, this strain was long-lived (Fig. 1a
and Supplementary Table 1). However, outcrossing (×5) of geIn3 to
wild type (N2) abrogated the increase in longevity (Fig. 1a and
Supplementary Table 1) without affecting SIR-2.1 protein levels 
(Fig. 1b). This loss of longevity upon outcrossing was verified by an 
independent research team (Supplementary Table 2). 

LG100 showed a neuronal dye-filling (Dyf) defect”’ that did not 
segregate with the transgene upon outcrossing (Supplementary Fig. 2a). 
Dyf mutants often show extended lifespan”. To determine whether the 
longevity of LG100 might be attributable to a dyf mutation, we derived 
from this strain three Dyf, non-Rol lines (lacking geIn3) and three non- 
Dyf, Rol lines (carrying geIn3). Dyf, non-Rol lines were long-lived and 
showed wild-type SIR-2.1 protein levels (Fig. 1c, d and Supplementary 
Table 3). Non-Dyf, Rol lines showed elevated SIR-2.1 protein levels but 
had wild-type lifespans. Dyf mutant longevity seemed to be partially 
dependent on daf-16 (Supplementary Fig. 2b), as seen previously for 
other Dyf mutants”. The co-segregation of longevity with this dyf 
mutation, but not with geIn3, was previously noted by another research 
team (S. S. Lee, personal communication). Furthermore, knockdown 
of sir-2.1 expression in LG100 using RNA-mediated interference did 
not suppress longevity, despite lowering SIR-2.1 protein to wild-type 
levels (Fig. 1e, f and Supplementary Table 4). Taken together, these
results indicate that the longevity of LG100 is attributable to an un- 
identified dyf mutation (or possibly another mutation closely linked to 
the dyf locus), and that high-level overexpression of sir-2.1 is not suf- 
ficient to increase lifespan in these strains. 

A low-copy-number transgenic strain (NL3909) overexpressing sir-2.1
(ref. 7) is also long-lived’. We confirmed the increased lifespan of NL3909 
(pkIs1642 [sir-2.1 unc-119] unc-119(ed3)) relative to the control strain 
NL3908 (pkIs1641 [unc-119] unc-119(ed3)) (Fig. 1g and Supplementary 
Table 5). We also observed an apparent increase in SIR-2.1 protein levels 
in NL3909 relative to NL3908 (Fig. 1h). Outcrossing (×6) of NL3909
once again abrogated longevity (Fig. 1g and Supplementary Table 5) 
without affecting SIR-2.1 protein levels (Fig. 1h and Supplementary 
Fig. 1c). This effect of outcrossing was independently verified (Sup- 
plementary Table 6). Thus, the longevity of NL3909 also seems to be 
attributable to effects of genetic background rather than to pkIs1642. 

The duplication mDp4 includes the sir-2.1 locus, and the mDp4-containing
strain DR1786 is long-lived’. We found that DR1786 is
indeed long-lived, and also shows elevated sir-2.1 expression.
However, longevity was not suppressed by sir-2.1 RNA interference
(RNAi) (Supplementary Fig. 3 and Supplementary Table 7), indicating
causation by factors other than sir-2.1, either on mDp4 or elsewhere in
the genome.

Figure 1 | Longevity of LG100 and NL3909 is not attributable to sir-2.1
overexpression in C. elegans. a, b, Outcrossing of LG100 removes lifespan
extension without affecting SIR-2.1 protein levels. Data in b are derived from
western blots (mean of three trials, each using an independent protein
preparation). A representative western blot is shown in Supplementary Fig. 1a.
Quantitative reverse transcriptase PCR showed that sir-2.1 mRNA is also
elevated in both strains (data not shown). c, LG100-derived Dyf, non-Rol
segregant lines are long-lived whereas non-Dyf, Rol lines are not. d, Non-Dyf,
Rol segregant lines have elevated SIR-2.1 levels, whereas Dyf, non-Rol lines do
not. e, f, sir-2.1 RNAi does not suppress LG100 longevity, but reduces SIR-2.1
protein levels. g, h, Outcrossing of NL3909 removes lifespan extension without
affecting SIR-2.1 protein levels. See Supplementary Tables 1-5 for lifespan
statistics for a, c, e and g, respectively. OE, overexpression. All error bars
represent s.e.m. *, 0.01 < P < 0.05; **, 0.001 < P < 0.01; ***, P < 0.001; NS,
not significant; Student's t-test (two-tailed). One remaining possibility is that
the outcrossed sir-2.1 strains both contain second-site mutations that suppress
longevity effects. However, daf-2 RNAi strongly induced longevity in both
strains (data not shown), arguing against the presence of a general suppressor of
longevity in each case.

In Drosophila, overexpression of dSir2 reportedly increases lifespan
relative to wild-type controls’. Overexpression was achieved using the
GAL4-UAS binary system”, with the largest increases in lifespan being
produced by the combination of EP-UAS-dSir2 (dSir2^EP2300) with a
ubiquitously expressed tubulin-GAL4 driver. We outcrossed these
two transgenes (×6) into the control white Dahomey (w^Dah) background.
When assayed on a medium similar to that used in the original
study, EP-UAS-dSir2/tubulin-GAL4 flies were longer-lived than wild-type
controls, as previously reported’ (Fig. 2a). However, they did not
live longer than the tubulin-GAL4/+ control flies (Fig. 2a). This
implies that lifespan extension is due to transgene-linked genetic
effects other than the overexpression of dSir2. Lifespan was assayed
on a range of food media (see Methods for details) to test for nutrient
dependence of any effect. However, in no case were EP-UAS-dSir2/
tubulin-GAL4 flies longer-lived than one or both transgenic controls
(Supplementary Fig. 4).

Figure 2 | Absence of effects of dSir2 on lifespan in Drosophila. All lines
were outcrossed into w^Dah (+/+). a, Lifespan in flies overexpressing dSir2^EP2300
driven via tubulin-GAL4 (tub-GAL4) is longer than in the wild type, but not
longer than in the tubulin-GAL4/+ genetic control. Median lifespans:
+/+, 39 days; dSir2^EP2300/tubulin-GAL4, 59 days; dSir2^EP2300/+, 53 days;
tubulin-GAL4/+, 60 days. P = 0.0006 for comparison of dSir2^EP2300/tubulin-GAL4
versus dSir2^EP2300/+; P = 0.9295 for dSir2^EP2300/tubulin-GAL4 versus
tubulin-GAL4/+; P < 0.0001 for dSir2^EP2300/tubulin-GAL4 versus +/+.
b, Lifespan in flies overexpressing dSir2-Myc9 is longer than in wild type, but
not longer than in the tubulin-GAL4 control. Median lifespans: +/+, 39 days;
dSir2-Myc9/tubulin-GAL4, 67 days; dSir2-Myc9/+, 41 days; tubulin-GAL4/+,
60 days. dSir2-Myc9/tubulin-GAL4 versus dSir2-Myc9/+, P = 0.0001;
dSir2-Myc9/tubulin-GAL4 versus tubulin-GAL4/+, P = 0.1354; dSir2-Myc9/
tubulin-GAL4 versus +/+, P < 0.0001. All comparisons were made using log-rank
tests, n = 200. c, The effect of dietary restriction on Drosophila lifespan is
not dSir2-dependent. Flies were assayed over five concentrations of SYA media
and data are presented as the median lifespan on each food concentration. All
lines were outcrossed into Canton S (+/+). P values confirm that all flies
respond normally to dietary restriction when median lifespans are compared
for dietary restriction (DR) versus fully-fed (FF) conditions”.

The lack of an observable effect on lifespan could reflect the relatively 
modest increase in dSir2 expression in EP-UAS-dSir2/tubulin-GAL4 


22 SEPTEMBER 2011 | VOL 477 | NATURE | 483


©2011 Macmillan Publishers Limited. All rights reserved 


LETTER 


flies, both in terms of messenger RNA levels (Supplementary Fig. 5) 
and protein levels (increased by 35% relative to wild type; Sup- 
plementary Fig. 6). We therefore created lines with a higher level of 
overexpression of dSir2 (UAS-dSir2-Myc9/tubulin-GAL4). Here, 
dSir2 mRNA and protein levels were robustly increased relative to wild 
type (an increase of 318% relative to wild-type protein levels; Sup- 
plementary Figs 5 and 6). We examined recombinant protein raised 
in Escherichia coli to check that the presence of the Myc tag did not 
interfere with dSir2 histone deacetylase activity, as measured by 
deacetylation of the fluorophore-containing p53 substrate (Fluor de 
Lys) or of native acetylated histone H4 substrates, and it did not 
(Supplementary Fig. 7). We also found that dSir2 histone deacetylase 
activity was unaffected by addition of resveratrol in either assay 
(Supplementary Fig. 7). We saw no increase in lifespan in UAS- 
dSir2-Myc/tubulin-GAL4 flies relative to tubulin-GAL4/+ controls, 
either on a food medium similar to that used in the original study 
(Fig. 2b), or relative to either control on a range of other media 
(Supplementary Fig. 4b, c, f). An independent research team also 
saw no increase in lifespan in UAS-dSir2-Myc9/tubulin-GAL4 flies 
(Supplementary Fig. 8). These results indicate that the previously 
observed longevity of EP-UAS-dSir2/tubulin-GAL4 flies was not attributable
to elevated expression of dSir2, and that stronger, ubiquitous
overexpression of dSir2 also does not extend fly lifespan. 

The role of sirtuins in the extension of lifespan by dietary restriction 
in yeast and C. elegans is controversial, with several groups reporting 
that sirtuins are not required for lifespan extension via dietary restric- 
tion in both organisms’. In Drosophila, it was reported that dietary 
restriction does not increase lifespan in dSir2 deletion-mutant flies’. 
We tested this too, using the deletion alleles dSir2^4.5 (tested previously?)
and dSir2^17. We first outcrossed these alleles (Supplementary
Fig. 9a) into the Canton S wild type (see Methods), which was used in
the previous dietary-restriction study’. We then checked the effect of
each allele on dSir2 gene expression. The allele dSir2^17 abrogated dSir2
mRNA, indicating that this is a null allele. By contrast, dSir2^4.5, which
contains a relatively small deletion at the 5’ end of the gene, did not 
reduce dSir2 mRNA levels (Supplementary Fig. 9b, c). 

To reassess the role of dSir2 in dietary restriction in Drosophila, we 
compared lifespans of wild-type (Canton S), dSir2^4.5 and dSir2^17
homozygotes. All genotypes responded similarly and normally to dietary 
restriction in trials conducted by two independent research teams 
(Fig. 2c and Supplementary Fig. 10), hence the effect of dietary restriction 
on lifespan did not require dSir2. 

In this study, we were unable to verify the effect of sirtuin over- 
expression on lifespan in either C. elegans or Drosophila. Increased 
lifespan was seen in two C. elegans lines with elevated sir-2.1 expression, 
derived from independent studies, as previously reported, but in each 
case this was abrogated by outcrossing. Overexpression of sir-2.1 does 
exert effects on traits other than lifespan. For example, geIn3 is neuro- 
protective in a worm model of neuron dysfunction in Huntington’s 
disease** and, notably, this effect is not attributable to the dyf mutation 
(Supplementary Fig. 11). Moreover, both NL3909 and its outcrossed 
derivative are thermotolerant (M. Somogyvari and C. Soti, unpublished
data). In Drosophila, lines overexpressing dSir2 were longer-lived than 
wild-type controls, as previously reported, but they were not longer- 
lived than lines containing the appropriate transgenic controls. The fact 
that all transgenic lines were longer-lived than the Dahomey wild type 
into which they had been outcrossed could reflect heterosis in the 
vicinity of the transgene inserts, or a mutagenic effect of the GAL4 
insert. 

Lifespan was not increased either by overexpression of sir-2.1 from 
its own promoter in C. elegans, or by ubiquitous overexpression of 
dSir2 from a heterologous promoter in Drosophila. Our findings call 
into question the robustness of earlier reports of a role for sirtuins in 
longevity assurance on the basis of overexpression in C. elegans and 
Drosophila, and also the role of dSir2 in the response to dietary restriction
in Drosophila. However, sirtuins can affect lifespan in animals
under certain conditions: C. elegans daf-2(e1370) mutants are hypersensitive
to genetic effects on lifespan’’, and in these mutants, deletion
of sir-2.1 reproducibly increases lifespan® (Supplementary Fig. 12). 

Our finding that resveratrol does not activate the histone deacety- 
lase activity of dSir2 using a native histone H4 peptide is consistent 
with earlier findings using yeast SIR2 and mammalian SirT1 (refs 17, 
18). Resveratrol increased Drosophila lifespan in one study”° but not in 
another”. In principle, this could reflect sensitivity of resveratrol 
effects to subtle differences in culture conditions. If this were the case, 
our findings would indicate that such effects are not attributable to 
direct activation of dSir2 by resveratrol. 


METHODS SUMMARY 


Nematode strains and maintenance. Nematodes were maintained on nematode- 
growth-medium agar at 20°C, with E. coli OP50 bacteria as a food source. 
Nematode strains used included: wild type (N2), GA707 wuEx166 [rol- 
6(su1006)] (rol-6 control), LG100 geIn3 [sir-2.1 rol-6(su1006)] dyf-?(wu250), 
NL3909 pkIs1642 [sir-2.1 unc-119] unc-119(ed3) and the control strain NL3908 
pkIs1641 [unc-119] unc-119(ed3). 

Nematode lifespan measurements. These were performed as previously 
described”, at 20°C. To prevent progeny production, 5-fluoro-2'-deoxyuridine 
(FUdR) was added to seeded plates, to a final concentration of 10, 40 or 50 µM.
Before testing the effects of RNAi on lifespan, worms were kept for two generations 
on the RNAi bacteria. The statistical significance of effects on lifespan was esti- 
mated using the log-rank test, performed using JMP, Version 7 (SAS Institute). 

Drosophila stocks and maintenance. Tubulin-GAL4 and dSir2^EP2300 were
obtained from the Bloomington Stock Center. The dSir2-Myc2 and dSir2-Myc9
lines were generated by germline transformation into strain w^1118. The dSir2^4.5/
SM6B, dSir2^17/CyO and Canton S lines were gifts from S. Pletcher, J. Rine and S.
Helfand. All lines were outcrossed at least six times into the relevant controls.
Experiments were performed at 25 °C on a 12h:12h light:dark cycle at constant 
humidity. 

Drosophila lifespan assays. Flies were bred at standard density, allowed to mate 
for 48 h after emerging, then sorted into ten females per vial. Vials were changed 
every 48 h, and deaths per vial were scored until all flies were dead. In overexpres- 
sion studies, n = 200. In dSir2-mutant studies, n = 100. For statistical methodo- 
logy, see earlier. 

dSir2 deacetylation assays. We used both the SirT1 Fluorimetric Drug Discovery 
Kit (Enzo Life Sciences) and an HPLC-based acetyl-histone-H4 deacetylation 
assay’. dSir2 and dSir2-Myc were cloned into pET SUMO (Invitrogen) and 
proteins were purified on HisPur cobalt spin columns (Thermo Scientific). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS


Nematode strains and maintenance. Caenorhabditis elegans were cultured under 
standard monoxenic conditions*'*’. Strains used included N2 (wild type), GA707 
wuEx166 [rol-6(su1006)], HT1593 unc-119(ed3), LG100 geIn3 [sir-2.1 rol- 
6(su1006)] dyf-?(wu250), NL3908 pkIs1641 [unc-119] unc-119(ed3) and NL3909 
pkIs1642 [sir-2.1 unc-119] unc-119(ed3). 

Outcrossing of nematode strains. LG100 was outcrossed with N2 and the Rol 
trait was used to detect the presence of geIn3. NL3908 and NL3909 were outcrossed
using HT1593 unc-119(ed3). Rescue of Unc (uncoordinated movement) was used 
to detect the presence of the transgene array. 

Isolation of Dyf, non-Rol and non-Dyf, Rol lines. LG100 was crossed with N2 
and lines were established from individual F2 animals with Dyf, non-Rol or
non-Dyf, Rol phenotypes. The Dyf phenotype was identified by staining with the dye
1,1'-dioctadecyl-3,3,3',3'-tetramethylindocarbocyanine perchlorate (DiI) and
looking for absence of dye uptake into the amphid and phasmid neurons.
Non-Dyf, Rol F2 animals that were heterozygous for the geIn3 transgene array (the rol-6
marker is dominant) were identified by the presence of non-Rol animals in the F3,
and were excluded. 

RNAi in C. elegans. Animals were fed E. coli containing the HT115 vector, either 
with or without a portion of the sir-2.1 gene cloned into it. The sir-2.1 feeding 
strain was obtained from the Ahringer RNAi library*’. Worms were maintained on 
RNAi feeding strains for two generations before lifespan measurements. One day 
before starting measurements, FUdR was applied to seeded plates at 10 µM to
prevent progeny production. 

Analysis of SIR-2.1 protein levels in C. elegans. Protein was prepared from 
synchronous nematode cultures (L4 larvae and young adults) raised on E. coli 
OP50 or RNAi bacteria for two generations. Western blots were performed with 
anti-actin monoclonal antibodies (Santa-Cruz Biotechnology) and an anti-SIR-2.1 
polyclonal antibody (provided by A. Gartner™). For all assays, 3-5 replicate worm 
cultures were used. 

Neuroprotection assays in C. elegans. To test for sirtuin-mediated protection 
from expanded polyglutamines (polyQs), we crossed GA919 (geIn3 dissociated 
from dyf-?(wu250)) to strains carrying integrated polyQ arrays. These polyQ 
strains co-express the first 57 amino acids of human huntingtin with either 19 
or 128 Gln residues fused to cyan fluorescent protein and expressed from the 
mec-3 promoter, and YFP expressed from the mec-7 promoter in touch-receptor 
neurons”. The response to touch at the tail was tested as previously described”. 
Three trials were performed and 150-200 animals were tested per genotype.

Lifespan analysis in C. elegans. Lifespans of synchronized population cohorts
were measured as previously described”*. FUdR was applied to the plates at 10, 40
or 50 µM (see Supplementary Tables). Lifespan experiments were performed at
20 °C. A small proportion of animals were censored, usually due to uterine rupture, 
which mainly occurred at mid-adulthood (~day 9-11). 
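
Censored animals still contribute to a survival estimate up to the day they are removed. One common way to handle this is the Kaplan-Meier estimator; the sketch below computes a median lifespan from it. The text does not state which estimator was used, so the choice of Kaplan-Meier, the function name and the example data are assumptions for illustration.

```python
def kaplan_meier_median(times, observed):
    """Median lifespan from a Kaplan-Meier survival estimate.

    times    -- day of death or censoring for each animal
    observed -- True if the animal died, False if it was censored
    Returns the first day on which estimated survival is <= 0.5,
    or None if survival never falls that low.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        day = data[i][0]
        deaths = 0
        removed = 0
        # Group all animals leaving the cohort on the same day.
        while i < len(data) and data[i][0] == day:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            # Animals censored on this day still count as at risk for it.
            survival *= 1 - deaths / at_risk
            if survival <= 0.5:
                return day
        at_risk -= removed
    return None

# Hypothetical cohort: one animal censored on day 10 (for example, uterine rupture).
days = [8, 10, 10, 12, 14, 15, 18, 20]
died = [True, False, True, True, True, True, True, True]
print(kaplan_meier_median(days, died))
```

Dropping the censored animal entirely, rather than censoring it, would shrink the at-risk count on earlier days and shift the estimate.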

Statistical analysis of C. elegans data. The statistical significance of effects on 
lifespan was estimated using the log-rank test, performed using JMP, Version 7 
(SAS Institute). 
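
For readers unfamiliar with the log-rank test, a minimal two-group version is sketched here using only the Python standard library. It is not the JMP implementation used in the study; the function name and the example lifespans are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value), 1 degree of freedom.

    times_*  -- day of death or censoring
    events_* -- True for a death, False for a censored animal
    """
    death_days = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for day in death_days:
        n_a = sum(1 for t in times_a if t >= day)  # at risk in group A
        n_b = sum(1 for t in times_b if t >= day)  # at risk in group B
        d_a = sum(1 for t, e in zip(times_a, events_a) if t == day and e)
        d_b = sum(1 for t, e in zip(times_b, events_b) if t == day and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    return chi_square, math.erfc(math.sqrt(chi_square / 2))

# Hypothetical lifespans in days, all deaths observed.
short_lived = [10, 12, 14, 16, 18, 20]
long_lived = [30, 32, 34, 36, 38, 40]
chi_square, p_value = logrank_test(short_lived, [True] * 6, long_lived, [True] * 6)
print(round(chi_square, 2), p_value < 0.01)
```

The test compares observed and expected deaths at every death time, so it is sensitive to differences across the whole survival curve and not only at the median.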

Drosophila stocks and maintenance. Tubulin-GAL4 and dSir2^EP2300 lines were
obtained from the Bloomington Stock Center. dSir2-Myc2 and dSir2-Myc9 lines
were generated by germline transformation. These were outcrossed into white
Dahomey (w^Dah). The strains dSir2^4.5/SM6B (ref. 35) and dSir2^17/CyO (ref. 36),
provided by S. Pletcher and J. Rine, were outcrossed into Canton S. All lines were
outcrossed at least six times. The presence of the deletion was detected by PCR
using the following primers: 149F (5'-AGATATGACATAAGGCAGTGGC-3'),
1427R (5'-TCCCGTTAGCACAATGATCTTC-3') and 3909R
(5'-GAAGGCGGTAGCAATGGTGACAA-3'). Flies were maintained at 25 °C on a
12 h:12 h light:dark cycle at constant humidity.

Myc-tagged dSir2. The Myc tag was added to RE27621 (Riken) using standard
techniques and cloned into pUASP. The construct was microinjected into w^1118 and
the transformant lines dSir2-Myc2 and dSir2-Myc9 were recovered. Primers were:
Sir5'R2 (5'-CAAGAATTCCAACGAGAATTTTACACAGGTCGTGTG-3'), Sir3'Xba
(5'-ATCGAGTCTAGACACTGCTGCTAACTGTCCTGGAGG-3') and MYC3'Xba
(5'-GAGCTATCTAGAGGATCCGAGGAGCAGAAGCTGATC-3').

Lifespan assays in Drosophila. Flies were bred at standard density (~300 flies per 
200-ml bottle), allowed to mate for 48 h after emerging (once mated) and then
sorted into ten females per vial (experiments performed at University College
London) or 35 per vial on 15% SYA (experiments performed at University of 
Michigan). Vials were changed every 48h and deaths per vial were scored until 
all flies were dead. The numbers of flies used in lifespan assays were: overexpres- 
sion studies, n ~ 200 (UCL) or n ~ 350 (U. Michigan); dietary-restriction studies, 
n= 100. For the overexpression studies, the fly-food recipes were as follows: SYA 
(100 g yeast, 50 g sugar, 15 g agar, 30 ml nipagin and, in most trials, 3 ml propionic 
acid per litre of food); ASG (20 g yeast, 85 g sugar, 10 g agar and 60 g maize per litre 
of food); ASG! (31g yeast, 124 g sugar, 9 g agar, 53 g cornmeal and 25 ml nipagin 
per litre of food); 15% SYA (150 g yeast, 150 g sugar, 21 g agar and 15 ml tegosept). 
For the dietary-restriction trials, the food dilutions used were as follows: 15 g agar, 
30 ml nipagin, 3 ml propionic acid, with yeast and sugar both altered to final 
concentrations of 10g, 50g, 100g, 150g or 200g per litre of food. All food was 
prepared as previously described”’. 

Genetic crosses in Drosophila. Tubulin-GAL4/TM3 males were crossed to
dSir2^EP2300, dSir2-Myc2 or dSir2-Myc9 virgin females, and dSir2^EP2300/+;
tubulin-GAL4/+, dSir2-Myc2/+; tubulin-GAL4/+ or dSir2-Myc9/+;
tubulin-GAL4/+ females were selected from the progeny. For the controls,
tubulin-GAL4/TM3 males or dSir2^EP2300, dSir2-Myc2 or dSir2-Myc9 virgin
females were crossed to w^Dah, and dSir2^EP2300/+, tubulin-GAL4/+, dSir2-Myc2/+
or dSir2-Myc9/+ females were selected from the progeny.

Quantitative reverse transcriptase PCR in Drosophila. RNA was extracted from 
ten females at 10 days of age using standard techniques and transcribed into 
cDNA. Four biological replicates were run per genotype, each in triplicate. 
Samples were normalized to either actin5C or ribosomal protein 49 (RP49). 
Primers used were: Sir2-4 5'-GCTCTCCACCGTTGTCTGAGGGCC-3’ (ref. 3), 
Sir2-5 5'-GGCGGCAGCTGTGCTGCGATGAG-3’ (ref. 3), Actin5CF 5’-CAC 
ACCAAATCTTACAAAATGTGTGA-3’, ActinCR 5'-AATCCGGCCTTGCAC 
ATG-3', RP49F 5’-ATGACCATCCGCCCAGCATCAGG-3’ and RP49R 
5'-ATCTCGCCGCAGTAAACG-3’. 
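
Normalization to a reference gene can be illustrated with the widely used 2^-ΔΔCt calculation. The text does not say which quantification formula was applied, so the method itself, the function name, the assumption of roughly 100% primer efficiency and the Ct values below are all assumptions made for this sketch.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene versus a control genotype (2^-ddCt).

    Each argument is a list of technical-replicate Ct values. The reference
    gene could be actin5C or RP49, as named in the text. Assumes both primer
    pairs amplify with ~100% efficiency.
    """
    d_ct_sample = mean(ct_target) - mean(ct_reference)
    d_ct_control = mean(ct_target_ctrl) - mean(ct_reference_ctrl)
    return 2 ** -(d_ct_sample - d_ct_control)

# Hypothetical triplicate Ct values, not data from the study.
fold = relative_expression(
    ct_target=[21.5, 22.0, 22.5],          # dSir2 in the overexpression line
    ct_reference=[18.0, 18.0, 18.0],       # reference gene, same sample
    ct_target_ctrl=[23.5, 24.0, 24.5],     # dSir2 in the control line
    ct_reference_ctrl=[18.0, 18.0, 18.0],  # reference gene, control sample
)
print(fold)
```

A two-cycle earlier threshold for the target, with an unchanged reference gene, corresponds to a fourfold increase in transcript level under these assumptions.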

Analysis of dSir2 protein levels. Protein was extracted from 30 females at 7 days 
of age. Western blots were performed using antibodies c-myc 9E10 (Santa Cruz 
Biotechnology), p2E2 (Developmental Studies Hybridoma Bank) and tubulin 
(Sigma). 

dSir2 deacetylation assays. Sequences encoding dSir2 (RE27621) and dSir2-Myc 
were cloned into pET SUMO (Invitrogen) and proteins were purified on HisPur 
cobalt spin columns (Thermo Scientific). For the Fluor de Lys assay, using the 
SirT1 Fluorimetric Drug Discovery Kit (Enzo Life Sciences), results presented are 
the mean + s.e.m. of three biological replicates. In each biological replicate, sam- 
ples were run in triplicate. Final concentrations were: resveratrol and suramin, 
0.2 mM; NAD⁺, 0.1 mM. Deacetylation of native acetyl-histone-H4 peptide was
monitored by HPLC”. Deacetylation of histone H4 amino-terminal peptide
(SGRGKGGKGLGKGGA(acetyl-K)RHRC) (Biomatik) was carried out using
500 µM NAD⁺, 100 mM Tris-HCl (pH 8.0), 150 mM NaCl, 0.5 mM dithiothreitol
and 0.05% Triton X-100, and monitored by HPLC (Agilent 1100) with an ACE
C8-300 150 × 3.0 mm column. The elution profiles were analysed using
Chemstation for LC 3D software. 

Statistical analysis of Drosophila data. Survivorships and the response to dietary 
restriction were compared using the log-rank test and analyses were performed 
using JMP, Version 7 (SAS Institute).

Bacteria and archaea acquire resistance to viruses and plasmids by
integrating short fragments of foreign DNA into clustered regularly 
interspaced short palindromic repeats (CRISPRs). These repetitive 
loci maintain a genetic record of all prior encounters with foreign 
transgressors’ °. CRISPRs are transcribed and the long primary 
transcript is processed into a library of short CRISPR-derived 
RNAs (crRNAs) that contain a unique sequence complementary to 
a foreign nucleic-acid challenger”"’”. In Escherichia coli, crRNAs are 
incorporated into a multisubunit surveillance complex called 
Cascade (CRISPR-associated complex for antiviral defence), which 
is required for protection against bacteriophages'*"*. Here we use 
cryo-electron microscopy to determine the subnanometre structures 
of Cascade before and after binding to a target sequence. These 
structures reveal a sea-horse-shaped architecture in which the 
crRNA is displayed along a helical arrangement of protein subunits 
that protect the crRNA from degradation while maintaining its 
availability for base pairing. Cascade engages invading nucleic acids 
through high-affinity base-pairing interactions near the 5’ end of the 
crRNA. Base pairing extends along the crRNA, resulting in a series of
short helical segments that trigger a concerted conformational 
change. This conformational rearrangement may serve as a signal 
that recruits a trans-acting nuclease (Cas3) for destruction of invad- 
ing nucleic-acid sequences. 

The CRISPR RNA-guided adaptive immune system in Escherichia 
coli K12 consists of eight cas genes and a downstream CRISPR locus 
(Fig. 1a). Cascade is a 405-kDa ribonucleoprotein complex composed 
of 11 subunits of five functionally essential Cas proteins (one CasA 
protein, two CasB proteins, six CasC proteins, one CasD protein and 
one CasE protein) and a 61-nucleotide crRNA’*™*. Previous structural 
and biochemical studies have determined the composition and general 
morphology of the Cascade complex. However, the arrangement of 
subunits and the mechanism of target recognition remain largely 
unknown. Using single-particle cryo-electron microscopy (cryo- 
EM), we determined the structure of Cascade at a resolution of 
~8 Å (Fig. 1, Supplementary Fig. 1 and Supplementary Movie 1).
This structure provides a detailed description of the subunit organiza- 
tion with sufficient resolution to observe secondary structure elements 
within each of the 11 protein components and the crRNA. 

Overall, Cascade has a sea-horse-shaped architecture with a helical 
backbone, as suggested by two-dimensional electron microscopy“. The 
backbone is capped at its ends by two prominent features representing 
the ‘head’ (CasE) and ‘tail’ (CasA) of the sea-horse anatomy. The 
resolution of the cryo-EM reconstruction, together with crystal struc- 
tures of two individual subunits and previously established subunit 
stoichiometries, allowed us to delineate the molecular boundaries of 
all the individual components. The resulting model provides detailed 
insight into Cascade organization (Fig. 1, Supplementary Fig. 2 and 
Supplementary Movie 2). 


The backbone of Cascade consists of six copies of CasC organized in 
a helical stack. Integral to the spine, the crRNA lies in a groove on the 
concave surface of the CasC helix. The extended conformation of the 
crRNA explains its importance for complex assembly and suggests that 
it has a structural role as a template for CasC subunit association.

The crRNA is anchored at both ends of the Cascade complex by 
specific protein-RNA interactions that can be seen in the cryo-EM 
structure. CasE is the endoribonuclease that specifically binds to a 
stem-loop in the CRISPR transcript, and cleavage results in a 61- 
nucleotide crRNA’*™*. A co-crystal structure of the CasE homologue 
from Thermus thermophilus in complex with the crRNA stem-loop 
fits with high fidelity into the head structure of the complex”” 
(Supplementary Fig. 2). After CRISPR cleavage, CasE remains bound 
to the 3’ end of the mature crRNA and the RNA stem-loop protrudes 
like a ‘beak’ from the head of the Cascade complex (Fig. 1 and Sup- 
plementary Fig. 2). The crRNA loops around the base of CasE and 
extends ~45 nucleotides along the binding groove in the helical back- 
bone. The 5’ end of the crRNA terminates within the tail of the com- 
plex, forming a hook-like structure in a pocket between CasC6 (C6, the 
sixth CasC subunit) and CasA (Fig. 1b). CasD is adjacent to this 
interface, sits at the midpoint of CasA and makes extensive contacts 
with the neighbouring domain of C6 (Fig. 1). 

The two CasB subunits form an elongated dimer positioned along 
the inner surface of the crRNA-CasC spine, connecting the head 
(CasE) and tail (CasA) of the Cascade complex. CasB1 sits next to 
the head, and has limited interactions with CasE and the first two CasC 
subunits (C1 and C2). The CasB2 subunit makes similar contacts with 
CasC subunits C3 and C4. The extended conformation of the CasB 
dimer creates a deep cleft that cradles the 3’ half of the crRNA spacer 
sequence (Fig. 1). 

The three-dimensional structure of Cascade reveals how six iden- 
tical CasC polypeptides assemble into an asymmetric helix that is 
programmed to terminate at C6. The first five CasC subunits are 
structurally similar, forming a right-handed helix with a pitch of 
135 Å (Fig. 2a). However, this symmetry is perturbed between C5 
and C6 owing to the interaction of C6 with the hook-like structure 
at the 5’ end of the crRNA (Fig. 2b). This interaction results in a ~160° 
rotation of the distal domain of C6 (Fig. 2). This rotation flips the distal 
domain out of the vertical axis of the CasC helix, breaking the helical 
arrangement. CasD and CasA stabilize this distinct structure and ori- 
entation of C6. The flipped-out conformation of the distal domain in 
C6 results in a larger gap between the C5 and C6 subunits, exposing a 
segment of crRNA in the 5’ region of the spacer sequence. 

Cascade engages invading nucleic acids with high affinity when they 
bear sequence complementary to the 5’ end of the crRNA spacer 
sequence (the ‘seed’ sequence: nucleotides 1-5, 7-8). Bacteriophages 
containing a single-nucleotide mutation in the seed region escape 
Cascade-mediated immunity, whereas point mutations outside the 


Howard Hughes Medical Institute, University of California, Berkeley, California 94720, USA. Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA. Life 
Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. Laboratory of Microbiology, Department of Agrotechnology and Food Sciences, Wageningen University, 
Dreijenplein 10, 6703 HB Wageningen, The Netherlands. Department of Chemistry, University of California, Berkeley, California 94720, USA. Physical Biosciences Division, Lawrence Berkeley National 
Laboratory, Berkeley, California 94720, USA. 
*These authors contributed equally to this work. 


486 | NATURE | VOL 477 | 22 SEPTEMBER 2011 




[Figure 1 image. Recoverable labels: crRNA; Cascade; spacer; cse1, cse2, cse4, cas5e, cse3; 120 Å; 190 Å; 180° rotation between views.] 


Figure 1 | Structure of the Cascade complex from E. coli. a, The CRISPR 
system in E. coli K12 (Cse-type) consists of eight cas genes and a downstream 
CRISPR locus. casA to casE are members of large gene families, referred to as 
cse1, cse2, cse4, cas5e and cse3, respectively. The CRISPR consists of a series 
of 29-nucleotide repeats (black diamonds) separated by 32-nucleotide spacer 
sequences (green cylinders). CasE (magenta) is an endoribonuclease that 
specifically binds to a stable stem-loop in the CRISPR RNA repeat and cleaves 8 
nucleotides away from the spacer sequence in the 5’ direction. b, Cascade 
assembles into a sea-horse-shaped architecture where the crRNA (green) is 
positioned along a helical arrangement of six CasC subunits (C1-6). The helical 
spine is capped at its ends by two prominent features that represent the head (E, 
CasE) and tail (A, CasA) of the sea-horse anatomy. D, CasD. c, Cascade consists 
of unequal numbers of Cas proteins and a crRNA (CasA₁B₂C₆D₁E₁:crRNA₁). 
The first five CasC subunits (C1-5) are structurally similar, whereas CasC6 is 
distinct. B1, CasB1; B2, CasB2. 


seed region do not facilitate escape. Despite these observations, little 
is known about the mechanism of target binding and the extent to 
which base pairing occurs between the crRNA and a target sequence. 
To address these issues, we determined the structure of Cascade bound 
to a 32-nucleotide single-stranded RNA (ssRNA) that is complemen- 
tary to the spacer sequence of the crRNA. Although Cascade is thought 
to target DNA, it has also been shown to bind ssRNA targets with high 
affinity. In vitro, Cascade makes specific and nonspecific interactions 
with double-stranded DNA substrates but interacts with RNA in a 
strictly sequence-specific fashion. We chose a ssRNA substrate to 
achieve maximal target site occupancy and sample homogeneity. 
Notably, RNA and DNA targets induce similar structural changes in 
the Cascade complex as detected by partial proteolysis (Supplementary 
Fig. 3). 
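
The seed rule described above (a mismatch at spacer positions 1-5 or 7-8 permits escape, whereas point mutations elsewhere do not) can be written as a simple positional check. The following is a minimal sketch only: the function names, the convention of writing the target 3’ to 5’ so that it aligns index by index with the spacer, and the example sequences are assumptions made for illustration and are not part of the study.

```python
# Seed positions named in the text: nucleotides 1-5 and 7-8 of the spacer (1-based).
SEED_POSITIONS = (1, 2, 3, 4, 5, 7, 8)

# Watson-Crick partners for RNA; a DNA target would use T in place of U.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_mismatches(spacer, target_site):
    """Return the 1-based seed positions where the target fails to pair.

    `spacer` is read 5'->3'. `target_site` is the target strand written
    3'->5', so index i of the target faces index i of the spacer.
    Both strings must be at least 8 nucleotides long.
    """
    mismatches = []
    for pos in SEED_POSITIONS:
        i = pos - 1
        if COMPLEMENT.get(spacer[i]) != target_site[i]:
            mismatches.append(pos)
    return mismatches


def escapes_by_seed_mutation(spacer, target_site):
    """True if any seed position is mismatched (the escape condition in the text)."""
    return bool(seed_mismatches(spacer, target_site))


# Hypothetical 10-nt example (not a sequence from the paper).
spacer = "ACGUACGUAC"
perfect = "UGCAUGCAUG"      # fully complementary target
mut_pos6 = "UGCAUACAUG"     # mismatch at position 6, outside the seed
mut_pos3 = "UGAAUGCAUG"     # mismatch at position 3, inside the seed
```

In this sketch position 6 is deliberately excluded, mirroring the discontinuous seed (1-5, 7-8) given in the text.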
Figure 2 | Programmed capping of the CasC helix. a, C1-5 form a right- 
handed helix with a pitch of 135 Å. The two domains of each CasC subunit are 
referred to as proximal and distal, relative to the helical axis of the CasC 
subunits. The different conformation of C6 (red) relative to the other CasC 
subunits interrupts the helical symmetry (black arrow). b, The crRNA is 
positioned along a contiguous groove on the concave surface of the C1-5 helix. 
The 5’ end of the crRNA forms a hook-like structure that interacts with C6. This 
interaction correlates with the distinct conformation of C6 and the termination 
of the helix. Although the proximal domains of C5 and C6 have the same 
orientation, the distal domain of C6 is rotated by ~160° relative to the other 
CasC subunits (black arrow). The centre of rotation is indicated by a black dot. 
The ~9 Å target-bound structure maintains the sea-horse morphology 
observed for the unbound complex, in which the CasC subunits 
can be superimposed on the unbound structure (Fig. 3a and Sup- 
plementary Fig. 4). However, examination of the other subunits reveals 
several significant differences that occur on target binding (Fig. 3b). 
The width of the crRNA density approximately doubles along the 
entire length of the spacer sequence, suggestive of duplex formation. 
Strikingly, however, the crRNA and target RNA strands do not form 
one contiguous double-stranded helix. Instead, we observe density 
consistent with five short duplex segments, each accommodating four 
or five base pairs of double-stranded RNA (Fig. 3b-e). The helical 
segments are connected by short (1-2-nucleotide) non-helical regions 
that seem to be the contact sites for individual CasC subunits (Fig. 3e). 

In addition to changes in the RNA, we observed a concerted con- 
formational change in the locations and orientations of CasE, CasB 
and CasA. CasE remains bound to the 3’ crRNA stem-loop, and target 
binding results in a clockwise rotation (~15°) consistent with a shortening 
of the crRNA spacer (Fig. 3b). This motion is coupled with 
movement of the CasB dimer, which forms a protein bridge between 
the head (CasE) and the tail (CasA) of the complex (Supplementary 
Movies 3 and 4). The two CasB subunits move ~17 Å along the crRNA 
binding groove, towards the tail. CasB2 interacts with a four-helix 
bundle in CasA, inducing a ~30° rotation of CasA. This rotation is 
centred around CasD, which functions as a hinge that connects CasA 
to C6. The distinct orientation of C6 relative to C1-5 is conserved in 
the target-bound Cascade complex (Fig. 3c, d). However, duplex 
formation on target binding seems to alter the interaction between 
C6 and the 5’ hook. Base pairing in the crRNA spacer is concomitant 
with a disruption of the hook-like structure at the 5’ end of the crRNA 
and results in a decrease in resolvable density for the distal domain of 
C6 (Fig. 3c, d and Supplementary Movie 5). 

The target-bound structure of Cascade reveals segments of density 
along the length of the crRNA spacer that accommodate short regions 
of double-stranded helix. This structural observation indicates that the 
entire spacer sequence is available for base pairing to a complementary 
target sequence. However, previous genetic and biochemical assays 
have identified a preferred high-affinity binding site in the 5’ seed 
region of the crRNA that is essential for phage protection. To test 
the relative binding affinities of discrete regions of the crRNA, we 
Figure 3 | Target binding triggers a concerted conformational change. 

a, Structure of Cascade bound to a 32-nucleotide target RNA complementary to 
the crRNA spacer sequence, at a resolution of ~9 Å. b, Removing the CasC 
subunits reveals significant structural differences between the unbound and 
target-bound structures. CasE, CasA and the crRNA from the unbound 
structure are shown as grey volumes. The same subunits from the target-bound 
complex are shown in magenta, green mesh and purple mesh, respectively. 
Models of the CasB crystal structure are shown docked into the cryo-EM map 
before (grey) and after (yellow) target binding. Four or five base pairs of double- 
stranded RNA (red) fit with high fidelity into the crRNA density. On target 
binding, CasE rotates by ~15° (magenta arrow), both CasB subunits move 
~17 Å along the concave surface of the CasC backbone (yellow arrow) and 
CasA rotates by ~30° (purple arrow). The purple asterisks indicate the 
positions of the four-helix bundle on CasA before and after target binding. c, In 
the unbound structure, CasC6 is rotated out of the CasC helix, exposing the 5’ 
region of the crRNA (double-headed arrow indicates density for the single- 
stranded crRNA). d, Additional density corresponding to the target nucleic acid 
is clearly visible in the target-bound complex (double-headed arrows). Base 
pairing in the crRNA spacer disrupts the hook-like structure at the 5’ end of the 
crRNA, and a difference map reveals a significant loss of resolvable density in 
the distal domain of C6 following target binding (red mesh and arrow). e, The 
short segments of double-stranded helices are connected by short, non-helical 
regions located at pinch points of the CasC subunits (white arrows). 


designed a series of 16-nucleotide target DNAs that tile across the 
crRNA in 8-nucleotide steps (Supplementary Fig. 5). Using native 
gel mobility shift assays, we observed high-affinity interactions for 
targets that include the seed region, and that binding affinities decrease 
with increasing steps in the 3’ direction (Supplementary Fig. 5). This 
indicates that each portion of the crRNA spacer is accessible for target 
binding, but that the unique structural context of the 5’ region of the 
crRNA results in a higher binding affinity for this region. A high- 
affinity seed binding site, of approximately the same length, has also 
been observed in other gene silencing systems. In eukaryotes, 
Argonaute proteins enhance target recognition by pre-ordering the 
microRNA seed sequence in a helical configuration, and we speculate 
that Cascade may use a similar mechanism. 
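
The tiling design used for the binding assay (16-nucleotide targets stepped by 8 nucleotides across the 32-nucleotide spacer) can be enumerated directly. This is a sketch under stated assumptions: it assumes every tile lies entirely within the spacer and counts positions from the 5’ end of the crRNA spacer; the actual oligonucleotides are defined in Supplementary Fig. 5, which is not reproduced here, and the function names are hypothetical.

```python
# Seed positions from the text (1-based, counted from the 5' end of the spacer).
SEED_POSITIONS = {1, 2, 3, 4, 5, 7, 8}


def tile_targets(spacer_length=32, tile_length=16, step=8):
    """Return (start, end) windows, 1-based and inclusive, that fit in the spacer."""
    windows = []
    start = 1
    while start + tile_length - 1 <= spacer_length:
        windows.append((start, start + tile_length - 1))
        start += step
    return windows


def covers_seed(window):
    """True if the window spans every seed position."""
    start, end = window
    return all(start <= p <= end for p in SEED_POSITIONS)


tiles = tile_targets()
seed_tiles = [w for w in tiles if covers_seed(w)]
```

Under these assumptions only the 5’-most tile contains the complete seed, which is consistent with the observation that affinity falls as the tiles step in the 3’ direction.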

Overall, our data suggest a model in which Cascade-mediated sur- 
veillance initially relies on high-affinity binding to the seed region of 
Figure 4 | A model for pathogen surveillance and signalling by Cascade. 
Efficient surveillance and detection of invading nucleic acids is mediated by 
base pairing in the seed sequence (nucleotides 1-5, 7-8) of the crRNA. Duplex 
formation may proceed in the 3’ direction (four curved black arrows), resulting 
in a series of short helical duplexes that shorten the crRNA; this in turn causes a 
concerted conformational change in CasA, CasB and CasE (coloured arrows) 
that coincides with a disruption of the 5’ hook and results in a decrease in 
resolvable density for the distal domain of C6. Cascade binds single-stranded 
and double-stranded substrates. Here we depict the target as a single strand for 
simplicity. 


the crRNA (Fig. 4). Following a seed match, duplex formation then 
proceeds in the 3’ direction along the length of the crRNA in incre- 
ments of four or five base pairs. These helical segments reduce the 
overall length of the crRNA, triggering the concerted conformational 
change that may serve as a signal to recruit Cas3 for target destruction 
(Fig. 4 and Supplementary Movies 3, 4 and 5). 


METHODS SUMMARY 
Cascade preparation. Proteins of the CRISPR system in E. coli (CasA-E) and the 
synthetic CRISPR RNA were co-expressed in E. coli BL21(DE3). Cascade was 
affinity-purified on Strep-Tactin Superflow Plus resin (Qiagen) using an amino- 
terminal Strep-II tag on the CasB subunit. The Strep-II peptide was removed from 
CasB by cleavage with PreScission protease, and the complex was further purified 
by gel filtration. The purified protein was used for native gel mobility shift assays 
using standard methods. 
Cryo-EM and image analysis. Purified complexes were applied to glow- 
discharged C-flats (Protochips Inc.), blotted and plunged into liquid ethane. Data 
were acquired using a Tecnai F20 Twin transmission electron microscope at 
20 e⁻ Å⁻² on a Gatan 4,000 × 4,000-pixel charge-coupled-device camera using 
the LEGINON data collection software. Data preprocessing was performed using 
functionalities within the APPION electron microscopy processing environment. 
The contrast transfer function (CTF) of each image was estimated during data 
collection using ACE2 and CTFFIND. Particles were initially selected using a 
difference-of-Gaussians particle picker, extracted using a box size of 280 × 280 
pixels, and classified two dimensionally using the IMAGIC package. The 
resulting reference-free class averages were used for template-based automatic 
particle selection. Three-dimensional maps were calculated using an iterative 
projection-matching approach with libraries from the EMAN2 and SPARX software 
packages. Volume segmentation, docking and visualization of molecular 
models were performed using CHIMERA. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Cascade preparation. The Cas proteins (CasA-E) and the synthetic CRISPR RNA 
were co-expressed in E. coli BL21(DE3) cells that were induced with 0.5 mM 
isopropyl-β-D-thiogalactopyranoside at D600 nm = 0.5 in overnight cultures 
grown at 16 °C (refs 13, 14). Cells from the overnight expression cultures were 
collected by centrifugation (5,000g for 10 min). The cell pellet was resuspended in 
lysis buffer (100 mM Tris, pH 8.0, 300 mM KCl, 1 mM EDTA, 1 mM 
tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and 5% glycerol), supplemented 
with protease inhibitors (Roche), and the slurry was sonicated on ice for 2 min in 
10-s bursts. The lysate was clarified by centrifugation (22,000g for 20 min) and the 
complex was affinity-purified on Strep-Tactin Superflow Plus resin (Qiagen) using 
an N-terminal Strep-II tag on the CasB subunit. The complex was eluted from the 
resin in 50 ml lysis buffer containing 2.5 mM desthiobiotin. The Strep-II peptide 
was removed from CasB by cleavage with PreScission protease during dialysis at 
4°C overnight against gel filtration buffer (25 mM Hepes, pH 7.5, 100 mM KCl, 
1 mM TCEP). The liberated Strep-II tag was removed using a second Strep-Tactin 
Superflow Plus column (Qiagen). The protein was concentrated (Amicon) for 
further purification on a Superose 6 size-exclusion column (GE Healthcare) equi- 
librated in gel filtration buffer. The target-bound complex was prepared by adding 
fivefold molar excess of an oligoribonucleotide complementary to the crRNA. The 
mixture was incubated at 37 °C for 15 min. The unbound oligoribonucleotide was 
separated from the target-bound complex on a Superdex 200 size-exclusion 
column (GE Healthcare). 

Cryo-electron microscopy. Preservation of nucleoprotein complexes in vitreous 
ice was performed in the same manner for both unbound and target-bound specimens. 
Aliquots (4 µl) of purified sample (~1.2 mg ml⁻¹) were placed onto C-flats 
(Protochips Inc.) that had been just glow-discharged in a nitrogen atmosphere for 
60s using an Edwards carbon evaporator. Grids were loaded into an FEI Vitrobot 
whose incubation chamber maintained an environment of 4 °C and 100% humidity. 
The grids were blotted for 3 s using a blotting offset of −1, and were then plunged 
into liquid ethane and stored in liquid nitrogen until being loaded into the electron 
microscope. Data were acquired using a Tecnai F20 Twin transmission electron 
microscope operating at 120 keV at a nominal magnification of ×100,000 (1.15 Å at 
the specimen level) using low-dose exposures (~20 e⁻ Å⁻²) with a randomly set 
focus ranging from −0.8 to −2.5 µm. A total of 2,370 images of unbound Cascade 
and 1,406 images of target-bound Cascade were automatically recorded on a Gatan 
4,000 × 4,000-pixel charge-coupled-device camera (15-µm pixel size) using the 
LEGINON data collection software. 

Single-particle pre-processing. All data preprocessing leading to three- 
dimensional reconstruction was performed using functionalities within the 
APPION processing environment. Concurrent with data collection, carbon 
edges were manually masked from the acquired images, and particles were initially 
extracted automatically using a difference-of-Gaussians particle picker. The contrast 
transfer function (CTF) was additionally estimated automatically during data 
collection using both the ACE2 program and the CTFFIND program. Particle 
image stacks were generated by extracting selected particles with a box size of 
288 × 288 (performed with the ‘batchboxer’ program) from images whose estimated 
CTF confidence value was greater than 80%. The stack was reduced by a 
factor of four, and reference-free, two-dimensional classification was performed 
using iterative multivariate statistical analysis and multireference alignment analysis 
(MSA-MRA) within the IMAGIC software package. The resulting class 
averages showing detailed structural information at a high signal-to-noise ratio 
were selected for use as templates for template-based automatic particle selection 
using FINDEM, resulting in a total of 498,137 and 389,166 particle selections for 
the unbound and target-bound particles, respectively. Particles were extracted in 
the same manner as previously described, and reference-free, two-dimensional 
classifications were again performed with the MSA-MRA methodology. The 
resulting gallery of 5,000 class averages was manually curated to remove ice con- 
tamination, false positives and damaged Cascade complexes. Another round of 
MSA-MRA was performed on the resulting ‘cleaned’ stack, and the resulting class 
averages were again inspected to remove false or damaged particle selections. Only 
particles contained in the final set of class averages were re-extracted from phase- 
flipped micrographs to generate the final stack for the data sets. The particle 
image stacks, which contained 275,573 and 176,090 particles for the unbound 


and target-bound complexes, respectively, were binned by a factor of two to a 
pixel size of 2.3 Å for three-dimensional reconstructions. 
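
The pixel sizes quoted in this section follow from simple scaling: binning by a factor n multiplies the sampling interval by n and divides the box edge by n. The sketch below reproduces that arithmetic for the stated values (1.15 Å at the specimen level, 288-pixel boxes, binning by two). The helper names are hypothetical, and the Nyquist figure is the standard two-pixel sampling limit rather than a value reported in the text.

```python
def binned_pixel_size(pixel_size_angstrom, bin_factor):
    """Sampling interval (Å per pixel) after binning by an integer factor."""
    return pixel_size_angstrom * bin_factor


def binned_box_size(box_pixels, bin_factor):
    """Box edge in pixels after binning; the box must divide evenly."""
    if box_pixels % bin_factor != 0:
        raise ValueError("box size is not divisible by the binning factor")
    return box_pixels // bin_factor


def nyquist_resolution(pixel_size_angstrom):
    """Finest resolvable spacing (Å) for a given sampling interval."""
    return 2.0 * pixel_size_angstrom


final_pixel = binned_pixel_size(1.15, 2)        # binned stack used for 3D work
final_box = binned_box_size(288, 2)             # box edge after binning by two
sampling_limit = nyquist_resolution(final_pixel)
```

With these inputs the sampling limit of the binned data is about 4.6 Å, so binning by two does not restrict reconstructions in the 8-9 Å range.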

Initial models for three-dimensional reconstruction were determined using a 
low-resolution SAXS reconstruction. The SAXS reconstruction was low-pass-filtered 
to a resolution of 60 Å and forward-projected at an angular increment of 
15°, and a multireference alignment was performed using the final 5,000 
reference-free class averages of each of the Cascade complexes. The aligned class averages 
were back-projected to generate a new density model, which was then used for 
another iteration of projection matching. Ten iterations of projection matching at 
an angular increment of 15° were performed using the EMAN reconstruction 
software to arrive at the unbound and target-bound Cascade densities, which were 
used as starting points for refinement using single particles. 
Three-dimensional reconstruction and analysis. The unbound and target- 
bound data sets were processed separately, each using their corresponding initial 
model. Three-dimensional refinements of the starting densities were performed 
using an iterative projection-matching approach with libraries from the EMAN2 
and SPARX software packages. Projection matching began at an angular increment 
of 25°, progressing down to 0.8° over the course of dozens of iterations. The 
reconstruction algorithm dictated that the reconstruction was only allowed to 
proceed to the next smaller angular increment once >95% of the particles had a 
pixel error of less than one pixel. The resolution was estimated by splitting the data 
set into two separate halves and calculating the Fourier shell correlation between 
the resulting volumes. The density was conservatively low-pass-filtered to this 
estimated resolution before proceeding to the next iteration. The estimated resolutions 
based on the Fourier shell correlation for the unbound Cascade density 
were 8.8 Å at a correlation of 0.5 and 7.7 Å at a correlation of 0.143. The estimated 
resolutions based on the Fourier shell correlation for the target-bound Cascade 
density were 9.2 Å at a correlation of 0.5 and 8.0 Å at a correlation of 0.143. 
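
Reading a resolution off a Fourier shell correlation (FSC) curve amounts to finding the spatial frequency at which the curve first falls below the chosen threshold (0.5 or 0.143 above) and taking its reciprocal. The following is an illustrative sketch, not the procedure implemented in the software packages named here: it assumes the curve is supplied as paired lists of spatial frequency (in Å⁻¹, ascending) and correlation, uses linear interpolation between shells, and the example curve is invented for demonstration.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold):
    """Resolution (Å) where the FSC first drops below `threshold`.

    Linearly interpolates between the two shells that bracket the crossing.
    Returns None if the curve never falls below the threshold.
    """
    for i in range(1, len(fsc_values)):
        above, below = fsc_values[i - 1], fsc_values[i]
        if above >= threshold > below:
            frac = (above - threshold) / (above - below)
            freq = spatial_freqs[i - 1] + frac * (spatial_freqs[i] - spatial_freqs[i - 1])
            return 1.0 / freq
    return None


# Invented three-shell curve, for illustration only.
freqs = [0.05, 0.10, 0.15]
fsc = [1.0, 0.6, 0.1]
res_half = resolution_at_threshold(freqs, fsc, 0.5)
res_0143 = resolution_at_threshold(freqs, fsc, 0.143)
```

Because the curve decreases with frequency, the 0.143 criterion always yields a numerically finer resolution than the 0.5 criterion for the same data, as seen in both pairs of values reported above.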

To dampen predominant low-resolution amplitudes, the density Fourier ampli- 
tudes of the two final reconstructed densities were adjusted to match experimental 
one-dimensional SAXS curves using the SPIDER software package. Segmentation 
of the densities was performed manually using the ‘volume tracer’ tool of the UCSF 
CHIMERA visualization software. UCSF CHIMERA was also used for rigid-body 
docking of crystal structures into the segmented densities, as well as for generation of 
all surface renderings of cryo-EM densities. To assess the difference in position of 
CasC6 relative to the other CasC subunits, helical models of CasC and the RNA were 
generated by using two components of the iterative helical real-space reconstruction 
method. Cryo-EM densities corresponding to C1-5, as well as their associated 
nucleic-acid densities, were first segmented from the asymmetric reconstructions. 
Rough estimates of the axial rise and rotation of the subunits were determined 
manually and used as initial parameters for the ‘hsearch_lorentz’ program, 
which determined the true axial parameters. The ‘himpose_long’ program was then 
used to impose the helical symmetry on the segmented density, generating the 
helical structure. 

Electrophoretic mobility shift assays. Binding assays were performed by incubating 
Cascade with 5’ ³²P-labelled single-stranded DNAs. Each reaction included 
25 mM HEPES, pH 7.5, 100 mM KCl, 1 mM TCEP, 1% glycerol, 1 mM MgCl2 and 
1 mg ml⁻¹ transfer RNA. All reactions were incubated for 15 min at 37 °C before 
electrophoresis on 6% polyacrylamide gels. Gels were dried and exposed using 
phosphor storage screens, scanned with a phosphorimager (GE Healthcare) and 
quantified using KALEIDAGRAPH (Synergy software). 

Limited proteolysis. Preparations of Cascade were annealed to ssRNA or single- 
stranded DNA substrates complementary to the spacer sequence, spacer plus the 
5’ handle (self) or spacer with a protospacer-adjacent motif. Limited proteolysis 
was performed at room temperature (25 °C) in a total reaction volume of 100 µl. 
Each reaction mixture contained 30 µM trypsin (Sigma), 3.7 µM Cascade, 25 mM 
Hepes, pH 7.5, 100 mM KCl, 5% glycerol and 1 mM TCEP. Aliquots (20 µl) of the 
reaction were sampled at each time point and added directly to 5× SDS-loading 
buffer at 95°C for 5 min. Reaction products were separated by sodium dodecyl 
sulphate polyacrylamide gel electrophoresis using 12% gels. 


30. Frank, J. et al. SPIDER and WEB: processing and visualization of images in 3D 
Polyamine sensing by nascent ornithine decarboxylase 
antizyme stimulates decoding of its mRNA 


Leo Kurian, R. Palanimurugan, Daniela Gödderz & R. Jürgen Dohmen 


Polyamines are essential organic polycations with multiple cellular 
functions relevant for cell division, cancer and ageing. Regulation 
of polyamine synthesis is mainly achieved by controlling the activity 
of ornithine decarboxylase (ODC) through an unusual mechanism 
involving ODC antizyme, the binding of which disrupts homodimeric 
ODC and targets it for ubiquitin-independent degradation 
by the 26S proteasome. Whereas mammals express several 
antizyme genes, we have identified a single orthologue, termed 
OAZ1, in Saccharomyces cerevisiae. Similar to its mammalian 
counterparts, OAZ1 synthesis is induced with rising intracellular 
polyamine concentrations, which also inhibit ubiquitin-dependent 
degradation of the OAZ1 protein. Together, these mechanisms 
contribute to a homeostatic feedback regulation of polyamines. 
Antizyme synthesis involves a conserved +1 ribosomal frameshifting 
(RFS) event at an internal STOP codon during decoding of its 
messenger RNA. Here we used S. cerevisiae OAZ1 to dissect the 
enigmatic mechanism underlying polyamine regulation of RFS. In 
contrast with previous assumptions, we report here that the nascent 
antizyme polypeptide is the relevant polyamine sensor that operates 
in cis to negatively regulate upstream RFS on the polysomes, where 
its own mRNA is being translated. At low polyamine levels, the 
emerging antizyme polypeptide inhibits completion of its synthesis 
causing a ribosome pile-up on antizyme mRNA, whereas polyamine 
binding to nascent antizyme promotes completion of its synthesis. 
Thus, our study reveals a novel autoregulatory mechanism, in which 
binding of a small metabolite to a nascent sensor protein stimulates 
the latter’s synthesis co-translationally. 

Our dissection of the elements controlling decoding of OAZ1 
mRNA involving RFS shows that it is negatively regulated within a 
polyribosome unit by nascent OAZ1 polypeptide, which serves as the 
sensor of polyamines in this system (Supplementary Fig. 1). To 
identify the elements in OAZ1 mRNA important for polyamine regu- 
lation of RFS, we generated constructs that either carried the authentic 
RFS site bearing a TGA codon or a deletion of a T nucleotide in this 
STOP codon, yielding an in-frame fusion (Fig. 1). Expression of all 
constructs was monitored in a mutant impaired in proteasomal 
activity to minimize effects of polyamines or the truncations on the 
stability of OAZ1 (refs 7 and 11). Polyamine levels and truncations had 
no significant effects on the abundance of OAZ1 transcripts (Sup- 
plementary Fig. 2). Quantitative western blot analyses were performed 
to determine ‘relative RFS efficiency’ by comparing levels for the con- 
structs carrying the RFS site to those of the corresponding in-frame 
controls. In the absence of added polyamines, the wild-type RFS con- 
struct yielded approximately 3% RFS efficiency (Fig. 1a). Addition of 
10 µM or higher concentrations of the polyamine spermidine to the 
growth media had no detectable effect on the in-frame control but led 
to a ~fourfold induction (~13% RFS efficiency) for the RFS construct 
(Fig. 1a and Supplementary Fig. 3). 
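
The ‘relative RFS efficiency’ defined above is a ratio of two quantified western blot signals, and the fold induction is the ratio of two such efficiencies. A minimal sketch of that arithmetic follows; the function names and the example signal values are hypothetical and chosen only to reproduce the approximate 3% and 13% efficiencies reported in the text.

```python
def relative_rfs_efficiency(rfs_signal, inframe_signal):
    """Percent protein from the RFS construct relative to its in-frame control."""
    if inframe_signal <= 0:
        raise ValueError("in-frame control signal must be positive")
    return 100.0 * rfs_signal / inframe_signal


def fold_induction(efficiency_with, efficiency_without):
    """Ratio of RFS efficiency with polyamine to RFS efficiency without."""
    return efficiency_with / efficiency_without


# Hypothetical band intensities in arbitrary units.
eff_minus = relative_rfs_efficiency(3.0, 100.0)    # no added spermidine
eff_plus = relative_rfs_efficiency(13.0, 100.0)    # with spermidine
induction = fold_induction(eff_plus, eff_minus)
```

Normalizing each RFS construct to its own in-frame control, rather than to a single reference lane, separates frameshifting efficiency from differences in expression or stability between truncation variants.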

Truncations at the 5’ end of the OAZ1 coding sequence had marked 
effects, yielding RFS efficiencies up to approximately 62%, with a 


gradual loss of polyamine regulation. These observations indicated 
that translation of OAZ1 mRNA bearing the RFS site is inhibited by 
sequences close to its 5’ end (Fig. 1a). A different scenario was 
observed for the 3’ end of OAZ1 mRNA, where deletions of 57 or 
more nucleotides resulted in a complete loss of polyamine regulation 
(Fig. 1b). In contrast to the 5’ deletions, however, the 3’ deletions 
caused RFS rates (~20%) that were only moderately higher than those 
of spermidine-induced wild-type OAZ1 (~13%). These findings 
demonstrated that a segment of the OAZ1 mRNA downstream of 
the RFS site is essential for negative regulation at low polyamine levels. 
Next, we asked whether the effects observed for 5’ or 3’ truncations 
were synthetic or epistatic. We observed a loss of polyamine regulation 
of all constructs bearing deletions at the 5’ end when combined with 
3’Δ150. Both in the absence and in the presence of externally applied polyamines, 
RFS rates detected for all of these constructs were remarkably 
high (~65%), indicating that 5’ and 3’ truncations had a synthetic 
effect in eliminating negative regulation of RFS (Fig. 1c). Together 
these experiments surprisingly revealed that RFS during decoding of 
OAZ1 is strongly inhibited by OAZ1 elements 5’ and 3’ to the RFS site, 
both of which are also required to sense polyamines. 

To test whether the element provided by the 5’ portion of OAZ1 is 
an mRNA secondary structure motif, silent mutations were introduced 
at all possible positions within the first 24 or 51 nucleotides of the 
OAZ1 coding sequence. These mutations had no effects on RFS, indi- 
cating that mRNA structure motifs in this area are not relevant for 
polyamine sensing (Supplementary Fig. 4a). To investigate whether 
instead the encoded polypeptide is mediating the regulation, we caused 
drastic changes to the polypeptide sequence close to the amino ter- 
minus by introducing two frameshift mutations (Fig. 2a). Despite the 
only small changes to the mRNA sequence, this ‘shift of frames’ variant 
(5’SF) had completely lost negative regulation of RFS in the absence of 
spermidine, indicating that the coding capacity rather than the mRNA 
structure of the 5’ portion of OAZ1 mRNA is critical for the observed 
inhibition of RFS. This notion was further supported by an OAZ1 
mutant that showed constitutive inhibition of RFS unresponsive to 
the addition of polyamines (Supplementary Fig. 5). The underlying 
mutation was a change of an Ile (I) codon at position 5 to a Phe (F) 
codon (I5F). Inhibition of RFS was obtained with both Phe codons 
tested, whereas all other permutations of codon 5 yielding Ile or Leu 
codons had no effect, showing that the relevant parameter is the 
encoded amino acid residue rather than specificity for a certain trans- 
fer RNA. Together, these findings led to the unexpected conclusion 
that the OAZ1 N terminus mediates a negative control of RFS during 
decoding of OAZ1 mRNA. 

Next we asked whether the N-terminal inhibitory element of OAZ1 
could act independently of the polyamine-regulated element that 
depends on sequences downstream of the RFS site. Therefore, we 
generated a set of constructs, in which OAZ1 variants extending only 
until 30 nucleotides downstream of the RFS site were fused to the 
mouse Dhfr open reading frame (ORF) (Fig. 2b). The much lower 
Figure 1 | Mapping of elements in OAZ1 that regulate its decoding. 

a, Analysis of the effects of OAZ1 5’ truncations. Top, schematic representation 
of constructs used, all of which encoded 2 Myc-tagged OAZ1 variants. 
Truncations were introduced at the 5’ end removing the indicated number of 
nucleotides starting from the ATG start codon. Two versions of each construct 
were generated, one that contained the RFS site with the TGA STOP codon, and 
one in-frame control construct, in which the first nucleotide of this codon was 
deleted. Middle, western blot analysis of Myc-OAZ1 levels in yeast pre1-1 cells 
expressing constructs either with RFS site (‘TGA’) or without (‘(ΔT)GA’), 


RFS rates obtained for the construct bearing an intact ORF1 in comparison 
to the corresponding 5’Δ48 and 5’SF constructs demonstrated 
that the N-terminal element is a strong inhibitor of RFS even in the 
absence of the sequence 3’ of the RFS site. This notion was further 
supported by the observation that the I5F mutation caused an even 
lower RFS rate in this context. These results therefore clearly demon- 
strated that the N-terminal inhibitory element can operate indepen- 
dently of the 3’ element, and that it is not directly regulated by 
polyamines. Whereas the exact mechanism by which the N terminus 
of nascent OAZ1 inhibits RFS remains to be explored, it is in line with 
an increasing number of examples of nascent polypeptides that inhibit 
their own synthesis’ (see also Supplementary Discussion). 

To characterize the polyamine-regulated element, which depends 
on sequence downstream of the RFS site, we used strategies analogous 
to the ones applied for the characterization of the upstream element. 
Whereas silent mutations affecting the 3’ end of the OAZ1 ORF had no 
effect on RFS (Supplementary Fig. 4b), a mutant version, in which the 
C-terminal eight amino acids are changed by two frameshift muta- 
tions, yielded constitutive RFS similar to constructs bearing carboxy- 
terminal truncations (Fig. 2a). These observations led us to the striking 
conclusion that the polyamine-regulated element also resides in the 
nascent polypeptide. To determine how close to the end of the OAZ1 
ORF this element is located, we individually mutated its last four 
codons to STOP codons (Fig. 3a). Whereas mutation of the last two 
codons had no effect on the decoding efficiency, mutating the third 
(9stop) or the fourth (12stop) codon from the C terminus resulted in 
constitutive RFS. These results indicated that residues very close to the 
OAZ1 C terminus are critically important for the function of the
polyamine-responsive element. Perturbation of this element results 
in constitutive RFS with an efficiency that is only slightly higher than 
that of wild-type OAZ1 upon polyamine induction. We note that dras- 
tic depletion of cellular polyamines leads to a reduction in translation 


grown either in the absence or presence of 10 µM spermidine. CDC11 was
simultaneously detected as a loading control. OAZ1 degradation products are 
indicated by asterisks. Bottom, relative RFS efficiencies (% OAZ1 protein 
obtained with a construct bearing the RFS site relative to the corresponding in- 
frame construct) calculated from quantification of western blot signals. wt, wild 
type. b, As in panel a, but with 3’ truncated OAZ1 constructs. c, As in panel
a, but with OAZ1 constructs combining the indicated 5’ and 3’ truncations. See 
also Supplementary Fig. 1 for an RT-PCR analysis of selected constructs. Error 
bars, s.d.; n = 3. 


and RFS also for OAZ1 constructs lacking these elements, indi- 
cating that translation across the RFS site shows an additional sensi- 
tivity to very low concentrations of polyamines (see Supplementary 
Discussion). 

The experiments described so far revealed the remarkable finding 
that RFS during decoding of OAZ1 mRNA is under negative control by 
distinct elements within the OAZ1 polypeptide. Their inactivation 
resulted in an up to 20-fold increase of RFS at low polyamine levels 
(Fig. 1c). The observation that residues very close to the C terminus of 
OAZ1 polypeptide are required for an inhibitory effect on decoding of 
the OAZ1 mRNA was surprising because this sequence emerges from 
the ribosome only after the RFS event. This raised the possibility that 
the OAZ1 polypeptide influences OAZ1 decoding in trans, that is, after 
its release from the ribosome. Co-expression of an OAZ1 in-frame
construct, however, had no effect on the decoding of an RFS reporter 
construct demonstrating that OAZ1 protein does not inhibit RFS in 
trans (Supplementary Fig. 6). These results indicated that the elements 
residing in the nascent OAZ1 polypeptide inhibit RFS in cis within the 
context of an OAZ1 mRNA-polyribosome complex. To determine 
whether there is a correlation of the association of nascent OAZ1 
polypeptides with ribosomes and the efficiency of RFS, ribosomal 
complexes were affinity-purified from a strain bearing a Flag-tagged 
version of ribosomal subunit RPL25 (ref. 13). Strikingly, higher levels 
of incompletely synthesized Myc-OAZ1 polypeptides of various sizes 
were associated with ribosomes in cells grown in the absence of poly- 
amines than in those grown in their presence (Fig. 3b). This result 
illustrated that polyamine induction of RFS promotes completion of 
OAZ1 synthesis and its release from the ribosomes. Consistent with 
this notion, either high or low levels of OAZ1 peptides were pulled 
down with ribosomes, respectively, when mutants were used that 
either repressed or induced RFS constitutively (I5F or cRFS, respec- 
tively, in Fig. 3b). We conclude that inhibition of RFS results in a 
Figure 2 | Two elements in OAZ1 polypeptide affect decoding of OAZ1
mRNA. a, Depicted are alignments of wild-type OAZ1 with mutant variants
carrying nucleotide insertions or deletions (marked in red) to switch reading
frames, the corresponding encoded polypeptide sequences, a western blot
analysis of Myc-OAZ1 levels derived from these constructs, and quantification
of data. SF, shift of frames variant. b, Mutations at the N terminus of the OAZ1
polypeptide strongly affect RFS independent of the downstream element. Top,
schematic representation of constructs, in which ORF1 sequence followed by 30
nucleotides of OAZ1 was fused to mouse Dhfr. Ha, haemagglutinin tag. Below,
western blot analysis of DHFR fusion protein levels and their quantification.
Mutations correspond to those introduced into full-length OAZ1 shown in
Fig. 1a, Supplementary Fig. 5, or in (a). Error bars, s.d.; n = 3.


stalling of ribosomes on OAZ1 mRNA, whereas induction of RFS
coincides with a release of full-length OAZ1. The electrophoretic 
mobility of the detected nascent OAZ1 polypeptides indicated that a 
significant fraction of them was associated with ribosomes that had 
already traversed the RFS site. This finding suggested that ribosomes 
with nascent OAZ1 polypeptides pile up on a fraction of OAZ1
mRNAs at low polyamine concentrations. This is consistent with the 
observation that a polyamine-regulated element extending close to the 
C terminus of the OAZ1 protein operates in cis within a translation 
unit and that low polyamine levels resulted in increased relative levels 
of OAZ1 mRNA in fractions with larger polysomes (Supplementary
Fig. 7). 

Another noteworthy aspect is that we found low levels of OAZ1 
polypeptides in association with ribosomes when the in-frame OAZ1 
construct lacking the RFS site was used, whether polyamines were 
added to the media or not (Fig. 3b). One plausible explanation for 
the lack of any polyamine effect on this construct is that the responsive 
element initially requires a certain distance between ribosomes to 
acquire its inhibitory function on the completion of OAZ1 translation. 
In this hypothesis, the RFS site causes a pause in translation leading to a
spacing of ribosomes sufficient for formation of the inhibitory ele- 
ment. Efficient translation of the in-frame construct, in contrast, 
would lead to a ribosome density unfavourable for formation of this 
element. If so, the inhibitory element should also be activated during 
decoding of the in-frame construct when ribosome density is reduced 
Figure 3 | Nascent OAZ1 polypeptide confers polyamine-regulated
inhibition of its own mRNA decoding by causing ribosome pile-up. 

a, Truncation of OAZ1 by as little as three residues from the C terminus 
abolishes polyamine regulation of OAZ1 decoding. Left, alignment of wild-type 
OAZ1 with variants carrying nonsense mutations (in red) shortening the 3’ end 
by 1-4 codons. Right and below, western blot analysis and quantification. 

b, Nascent OAZ1 polypeptides accumulate on ribosomes in the absence of 
polyamines. Ribosome pull-down assays were performed with a strain bearing a 
tagged ribosomal subunit (Flag-RPL25). Cells either expressed wild-type 
OAZ1, its in-frame (if) variant, the I5F mutant, or a version carrying a 
frameshift mutation near the 3’ end (leading to a change of the last four residues 
and a C-terminal extension of 16 residues) that causes constitutive RFS (cRFS).
Extracts from cells grown either in the absence or presence of spermidine were 
subjected to anti-Flag pull-down followed by a western blot analysis comparing 
co-precipitated Myc-OAZ1 polypeptides. An extract with full-length OAZ1 
(input) was loaded for comparison. The lower panel shows the analysis of the 
extracts used as starting material. IP, immunoprecipitation. c, Reduced 
translation initiation (RTI) rates promote formation of the polyamine- 
regulated inhibitory element during translation of an in-frame version of 
OAZ1. RTI was achieved by insertion of a sequence expected to form a hairpin 
in front of the START codon. The RTI-3’SF construct (not depicted), in 
addition, carried the frameshift mutations shown in Fig. 2a. Below the 
constructs, a western blot analysis of crude extracts and quantification of OAZ1 
levels are shown. OAZ1 levels are given relative to those for the wild-type 
OAZ1-if construct in the absence of spermidine, which was set to 100%. Error 
bars, s.d.; n = 3.


by lowering the rate of translation initiation”. To test this prediction, 
we inserted a nucleotide sequence expected to form an mRNA hairpin 
structure between the promoter and the translation initiation codon of 
the OAZ1 in-frame construct (Fig. 3c). Similar structures have been 
demonstrated to cause strongly reduced translation initiation (RTI) 
rates’”. In the absence of polyamines, we observed a reduction of OAZ1 
levels to approximately 10% in cells expressing the RTI construct 
compared to the levels obtained with the otherwise identical wild-type 
construct lacking the hairpin. Remarkably, however, when spermidine 
was added to the cultures, OAZ1 levels obtained with the RTI construct
were induced approximately twofold, whereas levels remained
unchanged for the wild-type construct. Introduction of the 3’SF muta- 
tion described above, which impairs the polyamine-responsive ele- 
ment, also resulted in a loss of polyamine regulation of the RTI 
in-frame construct, with OAZ1 levels in the absence of polyamines 
rising to those observed in their presence (Fig. 3c). We conclude that 
the same element in nascent OAZ1 is conferring polyamine regulation 
to the RFS reporter and the RTI in-frame construct. When polyamine 
concentration is low, this element, which requires a relatively low ribo- 
some density, can therefore either be activated by translational pausing 
caused by the RFS site or, even though not with full efficiency, by 
reduced translation initiation rates (see also Supplementary Discussion).

Our observation that regulation of RFS during OAZ1 decoding
involves the nascent OAZ1 peptide prompted us to ask whether the 
OAZ1 protein senses polyamines directly. We previously observed that 
high polyamine levels inhibit ubiquitin-dependent degradation of OAZ1 
protein. It was also noted that OAZ1 has homology to the polyamine- 
binding enzyme spermidine/spermine-acetyltransferase (SSAT)’, which 
was supported by structural similarity between antizyme and acetyl- 
transferases, including SSAT’®. To test for polyamine binding to 
OAZ1, we developed a filtration-based assay using 6His-OAZ1 purified
from Escherichia coli and radiolabelled spermidine or spermine. 6His-OAZ1
was incubated at approximately equimolar concentration
together with polyamines followed by centrifugal ultrafiltration.
6His-OAZ1 showed a strong retention of spermidine (~50%) and
spermine (84%), whereas neither material from mock preparations 
(Supplementary Fig. 8) nor several control proteins showed any values 
significantly above background (Fig. 4). We conclude that OAZ1 spe- 
cifically binds to both polyamines, apparently with a higher affinity 
for spermine. To address whether polyamine binding is a conserved 
property of antizyme, we produced human AZ1 as a fusion to maltose
binding protein (MBP). In contrast to the MBP control, human MBP- 
AZ1 specifically bound spermidine with similar efficiency as yeast 
OAZ1, indicating that polyamine sensing by antizyme is conserved 
from yeast to humans (Supplementary Fig. 9). 

What might be the mechanism by which binding of polyamines to 
nascent OAZ1 regulates its synthesis? The element in the C-terminal 
portion of nascent OAZ1 could arrest translation during elongation or at
the termination step. The former would be reminiscent of the regulated 
elongation arrest caused by a specific sequence in nascent E. coli SecM 
(secretion monitor) protein'’”. The observation that residues so close to 
the OAZ1 C terminus that they must still be in the ribosome exit tunnel 
are critical for this arrest to occur, indicates that interactions between 
these residues with inner surfaces of the ribosome contribute to the 
arrest, similar to what has been shown for SecM". Based on our findings, 
we propose a model (Supplementary Fig. 1), wherein nascent OAZ1 
Figure 4 | Polyamines bind to OAZ1 protein. Shown are the results of
measurements that determined the retention of radiolabelled polyamines during
ultrafiltration. Retention of [³H]-spermidine or [¹⁴C]-spermine by 6His-OAZ1
from three independent preparations was compared to that observed with buffer
only, with material from Ni-NTA mock preparations from an E. coli strain not
expressing 6His-OAZ1, as well as with Bacillus subtilis α-amylase (α-amy),
chicken egg lysozyme, bovine serum albumin (BSA), ovalbumin, thyroglobulin
(TG), and proteinase K (each at 10 µM). Error bars, s.d.; n = 3.
assumes a conformation that, together with the residues located
inside the exit tunnel, promotes an arrest of translation at low cellular 
polyamine concentrations, whereas binding of polyamines to OAZ1 
prevents formation of this inhibitory conformation. We found that 
various mutations downstream of the RFS site inactivated the
polyamine-regulated element (Supplementary Fig. 10), indicating that 
the downstream inhibitory element involves a larger structural domain. 
Formation of this element apparently requires translational pausing at 
the RFS site (Supplementary Discussion). In our model, the N-terminal
inhibitory element and the RFS site are cooperating to achieve the 
required translational pausing resulting in low ribosome density. 

A key finding of our study is that ribosome-associated nascent 
OAZ1 polypeptide acts as a direct sensor of intracellular polyamine 
concentrations. Therefore, this system provides an intriguing novel 
type of co-translational control mechanism, in which a metabolite 
binds to a nascent, ribosome-associated autoregulatory sensor protein 
that regulates the decoding of its own mRNA. Similar types of co- 
translational mechanisms are likely to control the expression of other 
autoregulatory proteins, including metabolic enzymes. One possible 
example is the expression of mammalian SSAT, an enzyme involved in 
polyamine catabolism, which is known to bind polyamines (see 
above). Translation of SSAT mRNA is induced by polyamines and 
depends on sequences close to both ends of its ORF’’. Polyamine 
binding to nascent SSAT might regulate its expression co-translation- 
ally, similar to what we observed for OAZ1, except that RFS is not 
involved. Beyond the control of ODC, antizyme was reported to have 
multiple additional cellular targets with relevance to cancer, such as the 
polyamine uptake system, Aurora A kinase, and the anti-apoptotic 
protein DeltaNp73 (refs 2, 20-22). The identification of antizyme as 
a polyamine sensor therefore may provide an interesting drug target to 
simultaneously stimulate downregulation of polyamine synthesis and 
uptake, as well as downregulation of cell cycle regulators. 


METHODS SUMMARY 

Determination of relative frameshifting efficiency. S. cerevisiae YHI29/1 
(pre1-1)'' (a gift from D. H. Wolf) transformants expressing OAZ1 constructs
either with the RFS site or the corresponding in-frame variants (see plasmid table 
in the Supplementary information) were grown in SD medium supplemented with 
10 µM spermidine where indicated. RFS efficiency (given as a percentage of the
concentration obtained for the in-frame construct) was calculated as follows. First,
steady-state signals obtained for OAZ1 expressed from the corresponding RFS and 
in-frame constructs as well as for CDC11 were quantified. Second, the OAZ1 levels 
were normalized for protein loading differences using the corresponding CDC11 
values. RFS efficiency was then calculated from the normalized OAZ1 levels 
obtained with the RFS construct and the corresponding in-frame constructs in 
at least three independent experiments. 
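
The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the signal values in the usage example are hypothetical and are not taken from the paper, and the s.d. is computed as a sample standard deviation, which the text does not specify.

```python
from statistics import mean, stdev


def rfs_efficiency(oaz1_rfs, cdc11_rfs, oaz1_if, cdc11_if):
    """Relative RFS efficiency (%) for one experiment.

    Each OAZ1 signal is first normalized to the CDC11 loading control
    from the same lane; the RFS construct is then expressed as a
    percentage of the corresponding in-frame construct.
    """
    norm_rfs = oaz1_rfs / cdc11_rfs
    norm_if = oaz1_if / cdc11_if
    return 100.0 * norm_rfs / norm_if


def summarize(experiments):
    """Mean and sample s.d. over independent experiments (at least three in the text)."""
    values = [rfs_efficiency(*e) for e in experiments]
    return mean(values), stdev(values)


# Hypothetical signal intensities (arbitrary units), NOT data from the paper:
# (OAZ1 RFS lane, CDC11 RFS lane, OAZ1 in-frame lane, CDC11 in-frame lane)
example = [
    (120.0, 1000.0, 2000.0, 1000.0),
    (150.0, 1000.0, 2000.0, 1000.0),
    (90.0, 1000.0, 2000.0, 1000.0),
]
avg, sd = summarize(example)
print(f"RFS efficiency: {avg:.1f}% +/- {sd:.1f}% (n = {len(example)})")
```

Normalizing each lane to CDC11 before taking the ratio is what allows lanes loaded with different amounts of extract to be compared.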

Ribosome pull-down assay. To analyse ribosome-associated nascent OAZ1 
polypeptides, strain LEY1 expressing RPL25-Flag-6His was used. Extracts from 
transformants of this strain encoding 2×Myc-OAZ1 variants were subjected to
anti-Flag pull-down and western blot analysis. 

In vitro polyamine binding assay. Polyamine binding was assayed with recombinant
affinity-purified 6His-OAZ1 and [³H]-spermidine or [¹⁴C]-spermine by an
ultrafiltration protocol. Purified 6His-OAZ1 (~10 µM) was incubated with
10 µM spermidine or spermine for 1 h on ice. Unbound polyamines were then
separated from bound ones by ultrafiltration. Samples of the filtrate and the 
retentate were subjected to scintillation counting. The resulting values from at 
least three independent measurements were used to calculate the percentage of 
bound polyamines with the formula $\%C_B = 100 \times (C_R - C_F)/C_R$, where $C_F$ is the
concentration of free polyamines detected in the filtrate, and $C_R$ the total
concentration of soluble polyamines in the retentate.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Growth of strains and western blot analysis for determination of RFS effi- 
ciency. Yeast rich (YPD) and synthetic minimal media with 2% dextrose (SD) 
were prepared as described’. YHI29/1 (pre1-1)"' transformants expressing OAZ1
variants from the copper-inducible PCUP1 promoter were grown in SD medium
supplemented with 100 µM CuSO4 (ref. 7). Spermidine (Sigma-Aldrich) was
added when the precultures were diluted into fresh media. Cells were grown at
30 °C from an attenuance measured at 600 nm (D600) of approximately 0.2 until
they reached a D600 of 0.6-0.8. Equal amounts of cells were harvested by pelleting
them at 4,000g for 5 min. Cell extracts were prepared by boiling the pellets in an
appropriate volume of loading buffer (62.5 mM Tris-HCl, pH 6.8, 2% SDS, 1%
β-mercaptoethanol, 10% glycerol, and 0.002% bromophenol blue) for 5 min at
100 °C. Extracts from cells corresponding to 0.5 × D600 (~15 µg protein) for the
in-frame constructs and from 1.0 × D600 (~30 µg protein) for the RFS constructs
were used for SDS-PAGE and quantitative two-colour western blot analysis using 
the Odyssey Infrared Imaging System (Li-Cor). Anti-Myc (9B11, Cell Signaling 
Technology) and anti-haemagglutinin (16B12, HISS Diagnostics) monoclonal 
antibodies were used for detecting tagged versions of various OAZ1 proteins. 
Anti-rabbit polyclonal antibody (Santa Cruz Biotechnology) was used to detect 
CDC11, which served as an internal protein loading control. Western blotting 
procedure as well as anti-mouse and anti-rabbit secondary antibodies coupled to 
fluorophores were used as described before’. 

Ribosome pull-down assay. We used a ribosome-tagged strain similar to one 
described by others'*. Strain LEY1, which expressed genomically tagged RPL25- 
Flag-6His, is a derivative of YHI29/1 generated using the integrative plasmid 
pLE133. LEY1 transformants expressing 2×Myc-OAZ1 variants were grown from
a starting D600 of 0.2 in a volume of 100 ml selective SD medium to a D600 of
approximately 0.8. Cells were harvested and resuspended in 600 µl of ice-cold lysis
buffer (20 mM HEPES, pH 7.4, 2 mM Mg(CH3COO)2, 100 mM KCH3COO,
0.5 mM DTT) containing a protease inhibitor mix (complete, EDTA free, from
Roche), and lysed with an equal volume of acid-washed glass beads (0.4-0.6 mm
diameter) by vortexing five times for 30 s with 1 min intervals on ice. Lysates were
clarified by an initial centrifugation at 10,000g for 5 min at 4 °C. The supernatant
was again cleared by centrifugation for 20 min at 10,000g at 4 °C. For affinity
purification of ribosomes, lysates containing 1 mg of total protein were mixed
with an equal volume of ice-cold 2× binding buffer (100 mM Tris-HCl pH 7.5,
24 mM Mg(CH3COO)2, 1 mM DTT, 1 mM PMSF, 50 U ml−1 RNasin). To this
mix, 100 µl of anti-Flag M2-agarose affinity resin (Sigma-Aldrich) was added
followed by an incubation for 4 h at 4 °C with end-over-end rotation. The resin
was then washed five times with 1 ml of ice-cold washing buffer (50 mM Tris-HCl,
pH 7.5, 100 mM KCl, 12 mM Mg(CH3COO)2, 1 mM DTT, 1 mM PMSF). After
washing, the bound proteins were eluted by incubating the resin in 50 µl of washing
buffer containing 300 µg ml−1 Flag peptide, and analysed by western blotting.

Sucrose density gradient fractionation and quantification of OAZ1 mRNA.
Yeast strain LEY1 carrying the plasmid pPM318 was grown in approximately
300 ml of synthetic medium without leucine and uracil containing 200 µM
CuSO4 at 30 °C. Spermidine (10 µM) was added to the indicated cultures approximately
3 h before harvesting. Cells were harvested when the cultures reached a
D600 of 0.6-0.8. Culture volumes corresponding to 100 D600 units were centrifuged
at 2,800g for 5 min at 4 °C. Cell pellets were washed once in 5 ml of ice-cold lysis
buffer (30 mM HEPES-KOH (pH 7.4), 100 mM KCH3COO, 30 mM
Mg(CH3COO)2, 0.5 mM DTT, Protease inhibitor cocktail without EDTA
(Roche), 200 µg ml−1 heparin). Cell pellets were resuspended in ice-cold lysis
buffer (1 ml final volume) containing 5 µl RNasin (Promega). Cells were lysed
by vortexing five times for 30 s at full speed in the presence of 500 µl 0.5-mm glass
beads with 30 s intervals on ice. The lysates were clarified by three 5 min centrifugation
steps, first at 5,200g, then at 10,600g, and finally at 20,800g. Extracts present
in 450 µl of the final supernatant were loaded on top of an 11 ml 10-50% w/v
sucrose gradient prepared in ice-cold gradient buffer (30 mM HEPES-KOH
(pH 7.4), 50 mM KCH3COO, 12 mM Mg(CH3COO)2, 0.5 mM DTT and
Protease inhibitor cocktail without EDTA (Roche)). Polysomes were separated
by centrifugation at 100,000g for 105 min at 4 °C in a Beckman SW40Ti rotor.
After the centrifugation, 750 µl fractions were collected from the sucrose gradient.
Each fraction was mixed with 2.5 volumes of ice-cold ethanol and kept at −20 °C for
15h or longer for RNA precipitation. To isolate total RNA from the fractions, they 
were centrifuged at 20,800g for 20 min. Supernatants were carefully removed from
all the samples and the pellets were dried at room temperature for 15 min. After
drying, the pellets were resuspended in 100 µl RNase-free water and placed on ice.
RNA from each fraction was isolated using Qiagen’s RNeasy mini kit. RNA was
eluted from each column with 25 µl RNase-free water. RNA samples isolated from
each fraction were transferred onto ice immediately after elution. Ten microlitres of
the eluted RNA samples were used for cDNA synthesis using the High-Capacity 
cDNA Reverse Transcription kit (Applied Biosystems). OAZ1 and TPI1 tran- 
scripts from each fraction were quantified by Q-PCR using TaqMan Gene 
Expression Master Mix (Applied Biosystems) and gene-specific TaqMan probes 
(Eurogentec). 

Affinity purification of 6His-OAZ1. A plasmid expressing codon-optimized 
6His-OAZ1 was constructed based on the pET11a vector (Merck) to obtain 
pDG240. E. coli strain Rosetta (Merck) was transformed either with pDG240 or 
the empty pET11a (mock) plasmid. Expression was induced for 4h at 30 °C with 
1 mM IPTG. Cell pellets corresponding to approximately 550 D600 units were
resuspended in 10 ml binding buffer (50 mM Tris, pH 7.8 at 4 °C). In order to
purify 6×His-tagged OAZ1, frozen cell pellets were thawed at 25 °C, then 20 mg of
lysozyme from chicken egg (Sigma-Aldrich), 2 mg of DNase I (Roche) and protease 
inhibitor mix (Roche) were added. Lysis was initiated by vortexing six times for 10s 
at 25 °C followed by incubation on ice for 45 min. Later, the lysate was clarified by 
centrifugation at 25,000g for 30 min at 4 °C. Imidazole was added to the supernatant 
to a final concentration of 20 mM, followed by addition of 250 µl of pre-equilibrated
Ni-NTA Sepharose (GE Healthcare). Binding was carried out by keeping the suspension
at 4 °C for 2 h with mild rotation. Unbound material was removed by
centrifugation at 200g for 3 min at 4°C. The beads were washed four times with 
10 ml lysis buffer containing 20 mM imidazole. Bound protein was eluted in 350 µl
lysis buffer containing 250 mM imidazole for 1 h. Protein concentration was deter- 
mined using the Bradford assay (Bio-Rad). 

Affinity purification of MBP-hAZ1. A plasmid expressing codon-optimized 
human AZ1 (hAZ1) as a fusion to maltose binding protein (MBP-hAZ1) was 
constructed based on the pMAL-c2 vector (New England Biolabs (NEB)) to obtain 
pJD633. E. coli strain Rosetta (Merck) was transformed either with pJD633 or the 
empty pMAL-c2 plasmid. Expression was induced overnight in Overnight Express 
Instant TB Medium (Novagen). Cell pellets were frozen in liquid nitrogen and 
stored at —80 °C. For lysis, pellets were thawed on ice overnight and resuspended 
in 10 ml binding buffer (50 mM Tris, pH7.8 at 4°C). To purify MBP or MBP- 
hAZ1, 10mg of lysozyme from chicken egg (Sigma-Aldrich), 1 mg of DNase I 
(Roche) and protease inhibitor mix (Roche) were added. Lysis was initiated by 
vortexing six times for 10 s at 25 °C followed by incubation on ice for 45 min. Later, 
the lysate was clarified by centrifugation at 25,000g for 30 min at 4°C. Amylose 
resin (200 µl; NEB) equilibrated with binding buffer was added to the supernatant.
Binding was carried out by keeping the suspension at 4°C for 2h with mild 
rotation. Unbound material was removed by centrifugation at 200g for 3 min at 
4°C. The beads were washed four times with 5 ml binding buffer. Bound protein 
was eluted in 350 µl binding buffer containing 10 mM maltose for 90 min.

Polyamine binding assay. To establish an ultrafiltration-based binding assay, we
first tested whether spermidine and spermine passed freely and without retention
through low protein binding (10K cutoff) centrifugal filtration devices with modified
polyethersulphone membranes (VWR). Samples (100 µl) were subjected to
centrifugal filtration at 2,500g for 4 min at 4 °C. These conditions yielded a filtrate
of approximately 15 µl. Samples (10 µl) of the filtrate and the retentate were then
used to determine the concentration of radioactive polyamines by scintillation
counting. Retention of polyamines was 5% or less as shown in Fig. 4. To detect
polyamine binding to proteins, purified 6His-OAZ1, MBP-hAZ1, or control
proteins were diluted to a final volume of 100 µl (~10 µM) using binding buffer
(50 mM Tris, pH 7.8 at 4 °C) containing 10 µM [³H]-spermidine (PerkinElmer) or
[¹⁴C]-spermine (GE Healthcare). After 1 h of binding on ice, these samples were
analysed by centrifugal filtration and scintillation counting as described above. 
The percentage of bound polyamines was calculated with the formula 
$\%C_B = 100 \times (C_R - C_F)/C_R$, where $C_F$ is the concentration of free polyamines
detected in the filtrate, and $C_R$ the total concentration of soluble polyamines in
the retentate. Control proteins used were B. subtilis α-amylase, chicken egg lysozyme,
bovine serum albumin, chicken egg ovalbumin, bovine thyroglobulin, and
proteinase K (all from Sigma-Aldrich). 
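
As a minimal sketch of the percentage-bound calculation described above (free polyamine measured in the filtrate, total soluble polyamine measured in the retentate), the following Python function may help. The function name, argument names and example counts are hypothetical illustrations, not values or code from the paper.

```python
def percent_bound(c_retentate, c_filtrate):
    """Percentage of polyamine bound to protein.

    c_retentate: total soluble polyamine concentration in the retentate
                 (free plus protein-bound).
    c_filtrate:  free polyamine concentration in the filtrate.
    Any consistent unit works (for example counts per sample volume),
    because only the ratio matters.
    """
    if c_retentate <= 0:
        raise ValueError("retentate concentration must be positive")
    return 100.0 * (c_retentate - c_filtrate) / c_retentate


# Hypothetical scintillation counts per sample, NOT data from the paper.
print(percent_bound(c_retentate=2000.0, c_filtrate=1000.0))  # 50.0
print(percent_bound(c_retentate=1000.0, c_filtrate=1000.0))  # 0.0
```

When no protein retains the polyamine, filtrate and retentate concentrations are equal and the result is zero, which corresponds to the buffer-only control.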


©2011 Macmillan Publishers Limited. All rights reserved 


IMAGES.COM/CORBIS 


CAREERS 


EUROPEAN UNION A universal patent system 
could ease researcher mobility p.501 


GRADUATE STUDENTS Study seeks to reveal 
career options for aspiring scientists p.501 


NATUREJOBS For the latest career 
listings and advice www.naturejobs.com 


EDUCATION 


Time to teach 


Young scientists want to concentrate on their research, but teaching can bring rewards. 


BY PAUL SMAGLIK 


Kostas Pagiamtzis’s experiences as a student
motivated him to be an effective
teaching assistant. “I was actually a
big critic of teaching assistants when I was an
undergraduate,” says Pagiamtzis. “I had both
really good ones and really bad ones.” The bad
ones didn't even seem to be trying. Then he
had to take on the role himself, while studying
for a PhD in electrical and computer engineering
at the University of Toronto in Canada. “I
thought I’d better not be one of the bad ones,”
he says. At the time, the University of Toronto 
had no formal training for teaching assistants. 


So Pagiamtzis looked to his adviser, his col- 
leagues and the Internet for advice. 

His diligence paid off: Pagiamtzis won three 
departmental teaching awards and gained 
interpersonal ‘soft skills’, such as communication
and time management, that prepared
him for a career as a microchip designer with 
Gennum in Burlington, Canada. 

Making the time and effort to teach can be 
difficult for young scientists — especially when 
mentors, advisers and other faculty members 
tell them to concentrate on their research. Train- 
ing varies wildly in content and quality. Some 
institutions mandate training only in topics such 
as sexual harassment and ethnic discrimination. 


Others offer voluntary courses on how to teach.
Some provide course- or topic-specific instruc- 
tion. And a few, such as Emory University in 
Atlanta, Georgia, and, now, the University of 
Toronto, mandate detailed training, in which 
teaching assistants or young instructors learn 
to teach first during discussion sessions with 
small groups of students, then in lab courses 
and, finally, in large lectures. Whether they are 
autodidacts like Pagiamtzis or have had formal 
training like graduate students at Emory, good 
teachers learn the iterative process of prepar- 
ing relevant lessons and presenting informa- 
tion effectively, then assessing the effectiveness 
of their efforts (see ‘Pedagogical pointers’).
EXPERT TIPS
Pedagogical pointers 


Early-career researchers are often 
unpractised at teaching, and can get 
distracted by their lab responsibilities. 
Here, experienced teachers offer some 
tips to help novices and their students 
get the most out of the classroom 
experience. 


Prepare 

Send your syllabus to your peers for 
feedback. Ask others who have taught a 
similar class to share their materials. 

Set aside time to develop course 
materials — it often takes longer than 
you think. 

Find a mentor whose philosophy and 
teaching style you would like to emulate. 
If possible, visit their classes before you 
begin teaching, to understand how they 
structure time, interact with students and 
promote learning. Talk to your mentor 
about what works and what doesn’t. 

Think about the skills and knowledge 
that you want your students to gain — 
and make sure that you are allowing time 
for your students to practise using them. 


Interact 

Focus less on content mastery than 

on skill mastery. You can’t expect your 
students to think critically in an exam if 
you haven’t asked the same in class. 

Don’t do for students what they can 
and should be doing for themselves. 
Teach them how to find the answers to 
their own questions, either alone or in 
groups. 

Don’t feel that you have to cover every 
topic that falls under the heading of 
your course. What does it matter that 
students know every definition in the 
textbook if they can’t do anything with 
that information? 


Assess 
Make sure to provide students with 
ample feedback, so that they and you 
recognize when they need improvement. 
Make sure you and your students 
have clear, measurable goals. Write 
them down and provide copies to your 
students. Revisit these goals throughout 
the teaching period and assess whether 
you've attained them. 
Be transparent with your students. 
Let them know what you expect, 
what you are doing and why you are 
doing it. Honesty will go a long way 
towards building a successful learning 
community. P.S. 


Teaching can benefit an academic’s career
whether or not it is their main focus. Bounc- 
ing between teaching and research can help to 
identify research questions, improve academic 
writing and hone presentation skills — partic- 
ularly those required for audiences with vary- 
ing knowledge and skill levels. Teaching can 
also be a laboratory in which to learn the soft 
skills that will be vital to a professional career. 


PREPARATION 
One of the most important aspects of teaching 
is also one of the most misunderstood: prepa- 
ration. Many new teachers think that prepara- 
tion means having a basic understanding of the 
course material, but mere familiarity is only the 
beginning. “I aimed to understand the material
one level deeper than what I was teaching,” says
Pagiamtzis. “But go as deep as you can in the
time you have allotted for preparation.”

To prepare course materials, Diane Ebert- 
May, a plant biologist at Michigan State Uni- 
versity in East Lansing, suggests thinking 
about the core skills or knowledge that teach- 
ers want their students to gain, then reverse 
engineering the syllabus to ensure that pupils 
get the desired benefits. “Then you have to 
practise those competencies with them,” says 
Ebert-May, who also trains biology postdocs in 
scientific teaching through an inter-university 
programme called FIRST IV. 

Students are often told to put in two hours of 
work outside the classroom for every one they 
spend in it; teachers should devote at least as 
much time to their own preparation, says Ebert- 
May. And that doesn't include marking work, 
advising students or other administrative tasks. 

Committing to 
that level of prepara- 
tion means mastering 
time management, 
especially for gradu- 
ate students or post- 
docs doing their own 
research. “Academic 
expectations keep 
going up. There just 
isn't enough time,” 
her schedule in the
same way as she does her syllabuses — by set- 
ting goals, then carving out time to meet them. 
What works best, she says, is to set aside blocks 
of time for specific activities: academic writ- 
ing, teaching preparation and correcting her 
students’ work. 

“It gets better,” Roark tells new teaching
appointees. The first two years in an academic
job are the toughest; teachers are simultane-
ously developing curricula, writing grants and
setting up a lab. “My first year is something I
don’t want to repeat. You’re developing eve-
rything de novo,” says Roark. Having become
established, she is now able to spend a bit less
time on preparation and a bit more on research.

Some programmes ease teachers in. For 
example, at Emory, graduate students work 
their way up from supervising lab courses to 
teaching independently, and so have time to 
get used to juggling different aspects of their 
career. They also have the option to focus on 
research rather than on teaching for a semester. 

Although Pagiamtzis didn’t have formal 
training in teaching, he did have an advice net- 
work. His PhD supervisor, Ali Sheikholeslami, 
an electrical engineer at Toronto, recom- 
mended that he ask random students ques- 
tions throughout labs or discussions — not 
to put them on the spot, but to check whether 
they were getting the material. Pagiamtzis also 
looked to other teaching assistants for support 
— for example, someone teaching an earlier 
section of the same course might be able to tell 
him that they had had a particular problem, 
alerting him that he might require extra time 
and attention for his own section. 

Good teaching, like good science, requires 
observation. Novice teachers should watch 
others, then get colleagues and peers to 
observe them and offer feedback, recom- 
mends Emily Rauscher, a postdoc in plant 
ecology at Pennsylvania State University in 
State College, who had some pedagogy train- 
ing and took part in FIRST IV. Many formal 
teacher-training programmes video-record 
practice teaching sessions; people who aren't 
in such a programme can get a friend or col- 
league to record a lecture, then review it with 
them, suggests Sidney Omelon, an engineer at 
the University of Ottawa. 


PRESENTATION 

Part of being an effective teacher involves being 
able to grab students’ attention — even being a 
showman of sorts. Young teachers should look 
at the day’s lesson as a story, with a beginning, 
middle and end, says Pagiamtzis. For example, 
he traces the history of computing from the 
invention of the transistor to the formation of 
technology giants such as Intel and Google by 
talking about how William Shockley, co-inven- 
tor of the transistor, “was a jerk”. Pagiamtzis 
interweaves the story of how transistor tech- 
nology morphed into microchips with tales 
of how Shockley’s abrasive personality drove 
away eight top scientists, some of whom went 
on to form a venture-capital firm that funded 
Google, Amazon and others. Intermingling 
science with the personalities behind it helps to 
hold students’ attention, says Pagiamtzis. Nanda 
Dimitrov, associate director of the Teaching 
Support Centre at the University of Western 
Ontario in London, Canada, agrees. “A lot of 
great researchers know the material very well, 
Biologist Diane Ebert-May suggests teachers spend twice as long preparing classes as teaching them.


but do not know how to engage the students,” 
she says. “You need to understand the learner, 
understand the learner's prior knowledge and 
understand how to motivate the learner.” The 
best teachers, says Dimitrov, use various 
approaches, including active learning and 
frequent assessments. That philosophy sums 
up a technique called ‘scientific teaching’,
which builds on the standard lecture format. 

“The notion that ‘If I cover it, they learn it’ 
is fatally flawed,” says Ebert-May. Her research
shows that students retain more when lectures 
are enhanced by interactive lessons and lots of 
feedback (D. Ebert-May et al. Bioscience 47, 
601-608; 1997). The best way for researchers 
to teach science, says Ebert-May, is to treat the 
classroom as if it were a lab, getting students to
ask research questions, do literature reviews, 
conduct research, analyse data and present 
results. “You want to have people working 
together to solve complex problems,” she says.


EXERCISING THE BRAIN 
Roark uses this approach when teaching 
about how nerves drive muscle-cell function 
in her introductory biology course. She gives 
each student a ‘neuron’ token with a voltage
value, then arranges the students into ‘neu-
ral networks’. They must work out whether
a particular muscle cell in that network will 
contract. “The students have to turn on their 
brains in my classroom,” says Roark. “They 
can't just sit there and take notes.” 
Pagiamtzis likes to challenge his students 
with problems that have unexpected solu- 
tions. For example, as part of the standard 
electronics curriculum, he asks them to 
calculate the level of amplification of a two- 
pole amplifier. They usually use a simplified 
formula called the Miller approximation, 
and most come up with the wrong answer. 
But with enough prodding, students come 
to understand that the usual formula is not 
valid at high frequencies. They will remem-
ber the lesson better for having discovered it
for themselves than they would for having
been taught it directly, says Pagiamtzis.

Although coming up with challenges 
requires a lot of effort, the work pays off — 
and not just for the students. Pagiamtzis has 
found that searching for special cases and 
exceptions to use in exercises deepens his own 
knowledge and understanding of the sub- 
ject. His experience agrees with the conclu- 
sions of a study published last month, which 
quantitatively shows that teaching helps to 
enhance graduate students’ scientific skill sets 
(D. F. Feldon et al. Science 333, 1037-1039;
2011). The authors suggest that coming up 
with multiple study designs and research 
premises for use in the classroom honed the 
graduate students’ own thought processes. 

Tobias Langenhan, a physiologist at the 
University of Würzburg in Germany, finds
that teaching and testing his students helps
him to think about where to put his future
research efforts, as well as how to refine his
teaching. “You realize that some of the prin-
ciples you teach are very well substantiated in
terms of experimental results and that others
are not,” says Langenhan. “Flipping back and
forth between teaching and research tells me
where I should invest more time in explaining,
and also where the pieces in the dogma we are
trying to explain to the students are missing.”

Not only did Pagiamtzis's classroom expe- 
riences force him to gain technical mastery 
of his subject matter, but the interpersonal 
skills that he learned have been invaluable to 
his industry job. He uses those skills when 
he explains the intricacies of computer chips 
to marketing people, or technical prob- 
lems to managers. An important part of 
that exchange, he says, is being a good stu- 
dent by actively listening. “In essence,” says 
Pagiamtzis, “we are always learning from and 
teaching each other.”


Paul Smaglik is a freelance writer in 
Milwaukee, Wisconsin. 
EUROPEAN UNION


Single patent system 


The European Union (EU) should
adopt a universal patent system with
English as its official language, suggests
a white paper released by the Charles III
University of Madrid on 12 September.
In The EU Patent System: To Be or Not
To Be, researchers argue that the existing
system impedes innovation. Currently,
patents can be filed in any language, and 
every EU nation has different stipulations, 
legal requirements and costs. Marco 
Giarratana, an associate professor in 
business strategy at Bocconi University in 
Milan, Italy, and a co-author of the report, 
says that a universal system in English 
would encourage innovation by lowering 
translation and other costs. He also argues 
that a shared language for patents would 
boost mobility among young scientists. 


GRADUATE STUDENTS 
Career options clarified 


A new group aims to help graduate 
students to learn about their options for 
scientific and other careers, particularly 
outside of academia. Announced on
8 September, the Commission on Pathways
through Graduate School and into Careers 
has been formed by the US Council of 
Graduate Schools and the Educational 
Testing Service. Patrick Osmer, chairman 
of the commission and vice-provost
for graduate studies at the Ohio State
University in Columbus, says that the 
group is polling students about their 
knowledge of career options, questioning 
those who have graduated about their 
career paths and asking employers in 
various sectors about their needs. The 
findings will be out in April 2012. 


UNITED KINGDOM 
Home enrolment lagging 


Meagre growth in postgraduate science, 
technology, engineering and maths 
enrolment by UK natives could put 
courses at English universities in long- 
term jeopardy, says a report from the 
Higher Education Funding Council
for England, out on 9 September. The
low growth coincides with large rises in 
international enrolment, says the report. 
Any decrease in overseas enrolment 
could threaten the “future viability of 
courses and the overall sustainability of 
these disciplines” by reducing university 
income. But the council says that recent 
rises in native undergraduate enrolment 
should carry over into postgraduate totals. 
EVERY GIRL DREAMS OF FALLING IN LOVE


BY SHELLY LI 


Every girl dreams of falling in love.
When I was five, I decided that my
future husband would have a smile
that could brighten a room.

At twelve, I dreamt of a husband who was
tall and slim, with dark, clean-cut hair and a
well-defined jawline. He would have green 
eyes that reminded me of evergreen trees. 

At sixteen, I wanted to date a guy with 
blonde curls that swept over his eyes. He 
would be a witty but tortured soul, perhaps 
a poet. He would be shy around everyone 
else, but with me, he would open 
up and make me laugh, and he 
would make my skin tingle when 
he kissed me. 

But when I entered the dating 
world, the faces of all three of the 
men I once thought I would marry 
began to fade into the ceiling of my 
empty bedroom. 

The future never turns out the 
way we think it will: the cruellest 
thing God ever gave children was 
the gift of imagination. 

Ten years passed, twenty years, 
and still, I never once got the chance
to answer the question: “Will you 
marry me?”. 

To this day, the closest I’ve ever 
got to a marriage proposal was 
when a man in a suit knocked on 
my door and asked: “Will you serve 
your country?” 

This was five days after the 
bombing of the railroad out of 
the Wisconsin River Basin. Like everyone, 
I wanted to help in any way, and so I agreed.
Three days later, Bobby came to me. Bobby 
was a programmed war machine, the new- 
est and the most human-like model ever 
designed. 

“Introduce him to the neighbours,” said 
the man. “Say the two of you eloped. You 
don't have to do anything except acknowl- 
edge your relationship with him, if ever 
questioned. He'll come home only in the 
winter, when he needs a place to hide out.”

Bobby looked like a classic American 
hero, straight off the poster. His hair was cut 
short, and his ears stuck out. His eyes were 
hazel, and he had big cheeks, and when he 
smiled, dimples appeared on his face. 

He was pretty laid back when we were 
alone, but around the neighbours, he was 
plenty charming, and the story we fabricated 
went off without a hitch. The way he looked
at me when other people were watching ... it 
actually made me feel loved. 

I knew that it wasn't real love. Bobby was a 
robot who went off and fought in nightmar- 
ish places when the neighbours thought — 
and I pretended — that he was going on a 
business retreat. 

Living a lie like that, with a husband who 
was not even a real man, initially I thought 
I’d be bothered. But when he was away, and I
sat around with the other married women on 
the porch, the lie became the reality. 

We’re all living a lie in our own way, after all.


When Bobby came home in the winter, 
there would never be a scratch on him. The
factory repaired him before they let him 
return, so that no one would see the cuts 
down his back, or bullet dents up and down 
his leg. 

In this sense, I learned so much from 
Bobby. He would tell me stories, when 
we were alone, if I asked. “Bobby’s Bed- 
time Tales,” I would jokingly call them. He 
laughed because he was programmed to 
laugh when I was laughing, but he didn’t 
understand humour. 

I was fascinated by the things he couldn't
understand. Sometimes, I would even
watch him sleep at night. Bobby always
slept soundly, even though sleep was
probably as foreign to him as gunfire was
to me. How could I not love Bobby, even
if he couldn't love me back?
Bobby would always be the unrecognized 
hero of the country, the robot that carried 
out the tasks that humans did not have the 
stomach to perform. 

More years passed and, like clockwork on 
every December 15th, a taxi would pull up 
to the driveway, and Bobby would emerge 
with a big suitcase. He would pretend it was 
heavy as he lugged it into the house, but it 
was always empty when he opened it. 

The routine was something I had come 
to accept, no matter how lonely I felt in the 
springtime, when I saw husbands 
helping their wives clean out the 
garage. 

It wasn't Bobby’s strength that I 
missed. I could mow my own lawn. 
I could throw my own steaks on the 
grill. I could make my own repairs 
on the house. 

It was his company I missed. I 
missed hearing the breathing of 
another person, before I drifted 
off to sleep. 

Nevertheless, no matter how 
many times I whispered under my 
breath for him to come back, be 
it spring, summer or autumn, he 
never returned. I kept telling myself 
that it was nothing I could change. 
When I signed up for this, I signed 
up with the desire to help my coun- 
try. Somewhere along the way, that 
desire began to fade just as the faces 
of my fairytale husbands faded. 

One day, I was trimming the 
tree in the front yard, the one that was tall 
enough to brush the second-floor windows. 
The leaves provided a lot of shade in the 
summer, but it was now autumn, and time 
for the branches to go. 

I climbed to the top of the tree and reached
out to clip off a branch. Somehow, my foot
slipped, and before I knew it, the grass was 
zooming towards me. 

I stuck out my hands to break the fall, but 
I ended up knocking into something else 
before hitting the ground. 

Gasping, I blinked and looked up at the 
man in whose arms I was cradled.

“Be careful,” Bobby said with a smile.
SEE COMMENT P.399


Shelly Li’s online home is www.shelly-li.com.
Her first novel, The Royal Hunter, will
be out this autumn, from Philomel Books.
BRIEF COMMUNICATIONS ARISING


Regulation of Caenorhabditis elegans lifespan by sir-2.1 transgenes


ARISING FROM H. A. Tissenbaum & L. Guarente Nature 410, 227-230 (2001) 


Tissenbaum and Guarente¹ identified the first metazoan Sir2 homologue
shown to affect lifespan, Caenorhabditis elegans sir-2.1. Independent
transgenic lines harbouring extrachromosomal DNA arrays containing
sir-2.1 and the dominant transgene marker rol-6(su1006) were reported to
extend mean lifespan between 15% and 50%¹. Similar extensions in mean
lifespan were also found for lines in which the sir-2.1 transgenic arrays
were integrated into the genome following γ-irradiation¹. However, the
extension of lifespan was overestimated in a high-copy sir-2.1 transgene- 
containing worm strain because of an unlinked mutation. 

One long-lived integrated transgenic line, LG100 geIs3 (formerly
known as geIn3, ref. 1), was outcrossed four times to wild-type N2
worms along with geIs101, an integrated rol-6(su1006) transgene control
allele, creating LG269 and LG268, respectively. LG269 has a 50%
increase in mean lifespan relative to LG268 (Fig. 1). Subsequently, we
were made aware of a mutation in LG100 that prevented dye filling
(Dyf−) of sensory neurons (exposed to the environment) with the
amphipathic fluorescent dye DiO (S. Lee, personal communication).
Dyf− is indicative of a sensory neuron defect and mutations that confer
sensory neuron defects can extend lifespan’. By chance, we found 
LG269 to be Dyf−, prompting us to outcross the strain two more times
thus segregating the Dyf− mutation from geIs3. The resulting strain,
LG389 (geIs3 Dyf+), has a shorter mean lifespan relative to its Dyf−
progenitor, LG269, but is still long-lived (9.7%, P < 0.0001) relative to
the similarly outcrossed control strain LG390 geIs101 (Fig. 1). Further
outcross of geIs3 resulted in lines (LG391-LG394) with a 14.3%
(P < 0.0001) increase in lifespan compared to similarly outcrossed
control lines (LG395-LG398) (Fig. 1). All strains harbouring geIs3 show
increases in sir-2.1 mRNA of 10-30-fold (Fig. 1, right inset). Outcrossed
strains (LG433-LG434) containing the Dyf− mutation, but not geIs3,
were also long-lived (10-16%) (Fig. 1, left inset), indicating that both
alleles probably contributed to the larger extension seen in the geIs3
Dyf− strains LG100 and LG269.

We now know that sir-2.1 is the second gene in an operon with
R11A8.5. Its expression is regulated by two different promoters: Pint,
an intergenic promoter that regulates expression of sir-2.1 in the hypo-
dermis and nerve cells, and Po, an upstream operon promoter that
regulates its expression in muscle cells and intestine’. The original sir-
2.1 transgenic lines’, geIs3 and geEx1-3, contain only 500 bases of DNA
upstream of the start of sir-2.1 and therefore lack Po. An independently
derived low-copy sir-2.1 transgenic line containing Po was reported to
extend lifespan by 26%*—an extension that was abolished by sir-2.1
RNA interference’. In conclusion, our original paper’ overestimated the
extension of lifespan in a high-copy sir-2.1 transgene-containing worm
strain due to an unlinked mutation. However, we still observe 10-14%
lifespan extension by sir-2.1 overexpression in the absence of the
unlinked Dyf− mutation. Finally, lifespan extension by sir-2.1 over-
expression can be greater in strains bearing sir-2.1 in the context of its
operonic promoter Po*”.

Figure 1 | Integrated transgene geIs3, expressing sir-2.1 from its intergenic
promoter (Pint), extends C. elegans lifespan following outcross of a linked
dye-filling mutation. Strains LG269 geIs3[Pint sir-2.1(+) rol-6(su1006)] and
LG268 geIs101[rol-6(su1006)] were outcrossed to wild-type N2 worms four
times (four outcrosses) before the discovery of a dye-filling mutation in LG269
(not present in LG268). Both lines were further outcrossed four times,
uncoupling the dye-filling (Dyf−) mutation from the geIs3 transgene in the
sixth outcross (six outcrosses). Lifespan assays were performed as described
previously’, Kaplan–Meier survival curves are presented along with expression
levels of sir-2.1 transcript relative to actin in the various strains as determined
by quantitative reverse transcription PCR from two independent samples of
each strain performed in duplicate (right inset), standard error of the mean
reported. Lifespan of the eight outcross strains represents the combined
survival data of four independent segregants of the eighth outcross for both the
geIs3 and geIs101 transgenes. No statistical lifespan differences were found
among the four independent geIs3 or geIs101 lines from the final outcross.
Following separation from geIs3 in the sixth outcross the Dyf− mutation was
outcrossed to N2 one additional time (seven outcrosses); two independent
Dyf− segregants were tested for lifespan along with a Dyf+ sibling control and
N2 worms were used for the outcross. Four outcrosses: LG268 geIs101, n = 197,
m = 16.7 days; LG269 geIs3, n = 222, m = 25.1 days (50.2%), P < 0.0001. Six
outcrosses: LG390 geIs101, n = 287, m = 17.6 days; LG389 geIs3, n = 278,
m = 19.3 days (9.7%), P < 0.0001. Eight outcrosses: LG395-LG398 geIs101,
n = 511, m = 17.7 days; LG391-LG394 geIs3, n = 517, m = 20.3 days (14.3%),
P < 0.0001. Left inset: N2, n = 165, m = 16.1 days; LG435, n = 175,
m = 16.7 days, P = 0.1476 (relative to N2); LG433, n = 129, m = 19.4 days
(16.2%), P < 0.0001 (relative to LG435); LG434, n = 146, m = 18.7 days (12%),
P = 0.0002 (relative to LG435), P = 0.8818 (relative to LG433). n = number of
worms tested, m = mean day of survival, parenthetical values are the per cent 
change in mean lifespan relative to control. P values are relative to outcross 
control unless otherwise stated. Note: all transgenic strains listed in Fig. 1 have 
been deposited at the Caenorhabditis Genetics Center. 
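
The parenthetical values in the legend are per cent changes in mean lifespan relative to the matched control. As a minimal sketch of that arithmetic, assuming the simple formula (test mean minus control mean) divided by control mean, the Python below recomputes three of the comparisons from the rounded means quoted in the legend. The function name `percent_change` and the dictionary labels are illustrative and do not come from the paper.

```python
def percent_change(control_mean: float, test_mean: float) -> float:
    """Per cent change in mean lifespan of a test strain relative to its control."""
    return (test_mean - control_mean) / control_mean * 100.0


# (control mean, transgenic mean) in days, copied from the Fig. 1 legend.
# The labels are only descriptive; they are not identifiers used by the authors.
comparisons = {
    "four outcrosses (LG268 vs LG269)": (16.7, 25.1),
    "six outcrosses (LG390 vs LG389)": (17.6, 19.3),
    "eight outcrosses (LG395-LG398 vs LG391-LG394)": (17.7, 20.3),
}

for label, (control, test) in comparisons.items():
    print(f"{label}: {percent_change(control, test):.1f}%")
```

Because the inputs here are the rounded means, the recomputed figures can differ from the reported ones by a few tenths of a percentage point (for example, about 50.3% against the reported 50.2%, and about 14.7% against the reported 14.3%). The reported values were presumably derived from unrounded means, which is an assumption rather than something the legend states.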
Note added in proof: A new report from the laboratories of S. Lee and C.
Murphy independently confirms that our low-copy sir-2.1 overexpres-
sion strain has an extended lifespan that is abolished by sir-2.1 RNA
interference’. This report also reaffirms the conclusion that sir-2.1 inter-
acts with DAF-16 (ref. 5) by showing that more than 1,000 DAF-16-
regulated genes are upregulated in the sir-2.1 overexpression strain.
Crystal structure of nucleotide-free dynamin


Katja Faelber, York Posor, Song Gao, Martin Held, Yvette Roske, Dennis Schulze, Volker Haucke, Frank Noé & Oliver Daumke


Dynamin is a mechanochemical GTPase that oligomerizes around the neck of clathrin-coated pits and catalyses vesicle 
scission in a GTP-hydrolysis-dependent manner. The molecular details of oligomerization and the mechanism of the 
mechanochemical coupling are currently unknown. Here we present the crystal structure of human dynamin 1 in the 
nucleotide-free state with a four-domain architecture comprising the GTPase domain, the bundle signalling element, 
the stalk and the pleckstrin homology domain. Dynamin 1 oligomerized in the crystals via the stalks, which assemble in a 
criss-cross fashion. The stalks further interact via conserved surfaces with the pleckstrin homology domain and the 
bundle signalling element of the neighbouring dynamin molecule. This intricate domain interaction rationalizes a 
number of disease-related mutations in dynamin 2 and suggests a structural model for the mechanochemical 
coupling that reconciles previous models of dynamin function. 


Dynamin, the founding member of the dynamin superfamily, is a 
100-kDa mechanochemical enzyme (Fig. 1a) involved in the scission 
of clathrin-coated vesicles from the plasma membrane’. The brain- 
specific isoform dynamin 1 mediates uptake of synaptic vesicles in 
presynaptic terminals**, whereas a function of dynamin 3 at the post- 
synaptic density has been described’. Dynamin 2 is ubiquitously 
expressed®, and mutations in its middle domain (MD), pleckstrin 
homology (PH) domain and GTPase effector domain (GED) are linked
to human diseases, for example, rare forms of centronuclear myopathy
and Charcot-Marie-Tooth peripheral neuropathy’. Upon recruitment 
via the carboxy-terminal proline-rich domain (PRD), dynamin oligo- 
merizes into helical structures around the neck of budding vesicles and 
catalyses vesicle scission in a GTP-hydrolysis-dependent manner*’. 
Different mechanisms for the scission reaction have been proposed, 
including GTP-hydrolysis-dependent constriction’, extension'’ and 
twisting? of the vesicle neck. Other models suggest that the GTP-bound
dynamin oligomer induces hemifusion of the inner membrane leaflet
followed by complete membrane scission after GTP-hydrolysis-
dependent release’*'’. To resolve the detailed molecular mechanism,
high-resolution structural data for full-length dynamin are required
which, to date, are available only for the isolated PH domain‘*”* and
the GTPase (G) domain’®’”. Low-resolution electron microscopy
reconstructions of dynamin oligomers showed that nucleotide binding
leads to constriction of helical assemblies through rearrangements in
the stalk region composed of the MD and GED". Furthermore,
G domain dimerization via a conserved interface across the nucleo-
tide-binding site was shown to mediate the stimulated GTPase activity”.
We recently described the structure of the stalk of the dynamin-like
antiviral myxovirus resistance protein 1 (MxA) GTPase and elucidated
its mode of oligomerization, which involves three distinct interfaces and
two loop regions in the stalk”. Using this information, we succeeded in
determining the structure of dynamin 1.

Figure 1 | Structure of nucleotide-free human dynamin 1.
a, Structure-based domain architecture of human dynamin 1.
The classical domain assignment is indicated below. b, Ribbon-type
representation of human dynamin 1. Regions not resolved in the crystal
structure are indicated by dotted lines. Domains, distinct secondary
structure elements and N and C termini are labelled. Lipid-binding
residues are indicated as o.


The structure of human dynamin 1 


We reasoned that the propensity of dynamin to oligomerize at high 
protein concentrations might interfere with the formation of protein 
crystals. On the basis of our previous MxA study”, we assayed a num- 
ber of mutants in a human dynamin 1 construct (amino acids 6-746, 
Fig. 1a) for oligomerization defects. Indeed, a five-amino-acid exchange 
(IHGIR395-399AAAAA) in a conserved motif mapping to loop L2 of 
the MxA stalk’® interfered with higher-order assembly and resulted in a 
monodisperse dimeric dynamin 1 species (Supplementary Fig. 1, see 
also ref. 21). Crystals of a construct containing additionally the K562E 
mutation were obtained in the absence of nucleotides and diffracted to 
a maximal resolution of 3.7 Å (Supplementary Table 1). The structure
was solved by molecular replacement and refined to Rwork/Rfree of 
28.4%/33.5% (Supplementary Table 1). To verify the sequence, 
the positions of 19 internal methionines were assigned by a single 
anomalous dispersion approach (Supplementary Fig. 2). 

Dynamin 1(APRD) has a four-domain architecture, composed of the 
G domain, the bundle signalling element (BSE), the stalk and the PH 
domain (annotated as superscript G, B, S and P, respectively), which does 
not strictly follow the sequence-derived domain boundaries (Fig. 1 and 


Predicted interface-3 


Figure 2 | The dynamin 1 dimer. In the crystals, stalks were arranged in a 
criss-cross fashion via crystallographic two-fold axis (black dotted lines). 
Assembly via the central interface-2 leads to an extended dynamin 1 dimer. 
Black rectangles indicate stalk interfaces shown in detail in the insets (see 
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Supplementary Fig. 3). The structure of the amino-terminal G domain is 
very similar to that of the isolated nucleotide-free G domain” (root- 
mean-square deviation (r.m.s.d.) of 1.4 A for 287 Ca atoms) and shows 
a curved central B-sheet surrounded by «-helices at both sides. The two 
switch regions known to mediate nucleotide-dependent conformational 
changes and the cis stabilizing loop, involved in G domain dimeriza- 
tion’’, are partly disordered. At the N and C termini of the G domain, 
helices «1 and «2°, together with «3° from the C-terminal part of the 
GED of the same molecule, form a three-helix bundle, the BSE”? (Fig. 1 
and Supplementary Fig. 4). Compared to the previously described G 
domain-BSE construct in the GDP*AIF, -bound form’, «1° is shifted 
by two turns relative to «2°/a3" (Supplementary Figs 2 and 4), whereas it 
interacts with the G domain in a similar fashion in both structures. 

At the C-terminal end of 2", the BSE connects to the stalk of 
dynamin 1. The stalk is composed of a four-helix bundle where three 
helices, «1°-013°, are provided by the MD and «4° by the N-terminal 
part of the GED (Fig. 1 and Supplementary Fig. 5a). «1° in dynamin 1 is 
subdivided into «1N®, «1M° and «1C® by two disordered loops, LINS 
and L1C®, compared to a single break of the corresponding helix in 
MxA. Furthermore, «3° in dynamin 1 is extended by a highly con- 
served loop L4°. At the C terminus of the stalk, 04° closely packs against 
o1°-93° via hydrophobic contacts and leads the polypeptide chain 
back to the BSE. Despite an overall sequence identity of only 16%, 
the architecture of the dynamin 1 and MxA stalk is remarkably similar 
(r.m.s.d. of 2.6 A for 160 aligned Co atoms, Supplementary Fig. 5). 

The PH domain is interconnected between α3^S and α4^S of the stalk
by two disordered loops, L1°” and L2*", and shows only minor
deviations from the isolated PH domain of dynamin 1 (refs 14, 15; r.m.s.d.
of 0.8 Å for 102 Cα atoms). The three lipid-binding loops are only
partially resolved and point towards the solvent (Fig. 1b and
Supplementary Fig. 3).


Dynamin assembly via the stalk 

Similarly to the MxA stalks, the dynamin 1 stalks in the crystals were
arranged in a criss-cross fashion resulting in a linear stalk filament.
The highly conserved symmetric interface-2 of 1,200 Å^2 is located in
the centre of the stalk (according to the MxA nomenclature, Fig. 2 and
Supplementary Figs 3 and 5). Assembly via this interface results in an
extended dynamin dimer that serves as a building block for dynamin
oligomers. The shape and dimensions of this dimer agree well with a
small-angle X-ray scattering study.

[Figure panel labels: predicted interface-1; linear filament; G domain.]
[Figure caption fragment:] Supplementary Fig. 2 for PH domain assignment). Disease-related dynamin 2
mutations causing centronuclear myopathy or Charcot-Marie-Tooth disease are
represented by pink spheres, with the amino acid exchange indicated.

We previously showed that a second hydrophobic interface in the
MxA stalk, interface-1, mediates assembly of higher-order oligomers.
In dynamin 1, however, the stalks do not contact each other directly at
the predicted interface-1 (minimal distance 4.5 Å, Supplementary
Fig. 5). This difference is caused by a 5° tilt of the dynamin 1 stalks
relative to the stalk axis. The hydrophobic nature of this surface in
dynamin 1 and its conservation in the dynamin family (Supplementary
Fig. 5c) indicate that it functions as an oligomerization site, as in
MxA. Closure of interface-1 might induce a pitch in the dynamin
assemblies, leading to helical oligomers rather than to ring-like
structures as in MxA.

L2^S, containing the disruptive IHGIR395-399AAAAA mutation,
and L1N^S are not ordered in the linear dynamin 1 oligomer
(Supplementary Fig. 5). The corresponding loops in the MxA stalk
form a third interface (interface-3) which also contributes to the
assembly of oligomers (Fig. 2). Accordingly, mutations in both loops
in dynamin 1 (ref. 21), MxA and in L1 of dynamin 1-like protein
prevent oligomerization. Interestingly, Ser 347 and Tyr 354 in loop
L1N^S in dynamin 1 are phosphorylated in vivo and might control
the assembly status.


Interactions of the BSE 


The BSE interacts with the central β-sheet of the G domain via a mostly
hydrophilic interface of 1,100 Å^2 (Supplementary Figs 4 and 6). In
contrast, the BSE and concomitantly the G domain are only loosely
associated with the stalk of the same molecule via loops L1^BS and L2^BS
constituting a flexible hinge (hinge 1), as observed in other
dynamin-related proteins (Figs 1b and 2, and Supplementary Fig. 2).
Interestingly, Asp 744 at the C terminus of α3^B of the BSE contacts
Arg 440 in α2^S of the neighbouring, parallel dynamin 1 stalk (Fig. 3a).
A similar intermolecular interaction mediates oligomerization and
the antiviral function of MxA (S.G., K.F., O.D., unpublished
observation). We tested the importance of this contact experimentally. The
wild-type dynamin 1 construct bound efficiently to liposomes, resulting
in an approximately 200-fold stimulation of GTPase activity
(Supplementary Fig. 7). The single R440A and D744A mutants behaved as
wild type in these assays. To analyse the role of these residues in
clathrin-mediated endocytosis, dynamin 2-eGFP mutants (a fusion of
dynamin 2 with enhanced green fluorescent protein) were re-expressed
in HeLa cells depleted of endogenous dynamin 2. Both R440A and
D744A mutants localized to the plasma membrane similarly to wild-type
dynamin 2 (Supplementary Fig. 8), but transferrin internalization was
increased (Fig. 3b and Supplementary Fig. 9). Thus, the salt bridge has
an inhibitory and/or control function in dynamin-based endocytosis.


The stalk-PH domain interface


α1M^S of the stalk forms a conserved surface of 370 Å^2 with the PH domain
(Figs 1b and 2, and Supplementary Figs 3 and 5c). Interestingly, 19 unique
mutations causing centronuclear myopathy or Charcot-Marie-Tooth
disease cluster in the stalk or the PH domain of dynamin 2, but none
localizes to the G domain or BSE (Fig. 2). For example, mutations
E368K/Q and R369W/Q in the stalk and A618T and S619L/W in
the PH domain are directly in or in close vicinity to the interface
between the two domains. Mutations A618T and S619L/W lead to
increased oligomerization rates of dynamin in solution, suggesting
that the stalk-PH interface controls oligomerization. In gel filtration,
the disease-related stalk mutant E368K also eluted as a high molecular
weight species. Consistently, this mutant showed a 20-fold increased
basal GTPase rate, whereas the liposome-stimulated GTPase reaction
and transferrin uptake were unchanged (Fig. 3b and Supplementary
Fig. 7). In contrast, the R369W mutant behaved as wild-type dynamin
in biochemical and endocytosis assays. Mutations at the periphery of
this interface (S619L, L621D) also did not compromise dynamin
2-based endocytic activity (Fig. 3b), indicating that more subtle
changes lead to the disease phenotype. Interestingly, the F372D
mutant in the centre of the interface showed significantly higher
transferrin uptake compared to wild-type dynamin 2, pointing also to an
inhibitory and/or control function of this interface for dynamin-based
endocytosis.

Figure 3 | Stalk interactions with the BSE and PH domain. a, Top view of
the dynamin 1 oligomer. The PH domains are not drawn for clarity. The inset
shows a close view of the intermolecular BSE-stalk interaction. b, HeLa cells
depleted of endogenous dynamin 2 by short interfering RNA (siRNA) were
transfected with a plasmid encoding siRNA-resistant dynamin 2-eGFP and
allowed to endocytose fluorescently labelled transferrin. In transfected cells,
fluorescence was quantified and normalized to mock-treated cells expressing
eGFP. Data shown represent mean ± standard error; *P < 0.05; **P < 0.01;
***P < 0.0001 for wild-type versus mutant dynamin 2-eGFP, as determined
by t-test.
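
The transferrin uptake values are reported as fluorescence normalized to mock-treated, eGFP-expressing cells and summarized as mean and standard error. A minimal sketch of that normalization, assuming per-image intensity lists as input (the function name and data layout are hypothetical, not taken from the paper):

```python
from math import sqrt
from statistics import mean, stdev

def normalized_uptake(sample, mock):
    """Mean and standard error of intensities normalized to the mock mean.

    `sample` and `mock` are per-image fluorescence values (assumed layout).
    """
    reference = mean(mock)                      # mock-treated eGFP cells define 1.0
    normalized = [x / reference for x in sample]
    sem = stdev(normalized) / sqrt(len(normalized))
    return mean(normalized), sem
```

A value above 1 for a mutant would then correspond to increased internalization relative to the mock reference, as described for R440A, D744A and F372D.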


Conformational changes during dynamin assembly 


Based on the isolated BSE and PH domain (Fig. 1b), the stalk dimer
(Fig. 2), the GDP·AlF4−-bound G domain dimer, and electron
microscopy reconstructions of oligomerized dynamin 1 in the
constricted state, we generated a molecular model for self-assembly of
dynamin into helical structures (Fig. 4a and Supplementary Fig. 10).
The resulting dynamin dimer spans a length of more than 260 Å and
covers 95° of a dynamin ring, thereby placing G domain and BSE
above the neighbouring stalk. According to this model, oligomerization
of dynamin proceeds along the central stalk, whereas the G
domains mediate contacts between neighbouring turns (see ref. 20
for a similar MxA model).

When comparing the linear structure of dynamin 1 in the crystal
with the helical assembly, large-scale domain movements are apparent
(Supplementary Fig. 11). The G domain-BSE unit is shifted and rotated
around hinge 1 upon oligomerization. Additionally, the G domain
separates from the BSE by a rotation around the invariant Pro 32
and Pro 294 (Supplementary Fig. 6). The corresponding residues
Gly 68 and Gly 309 in BDLP and Pro 342 in atlastin have also
been suggested to act as a hinge (hinge 2). Integrity of the G domain-BSE
interface is crucial for the function of dynamin, as indicated by the
aggregation of the I10D interface mutant in vivo (Supplementary Figs 6
and 8). Furthermore, I10D behaved as a dominant-negative mutant in
transferrin uptake assays (Fig. 3b). Moreover, the QD17-18AA mutation
rendered dynamin 2 largely inactive in endocytosis assays, despite
localizing correctly to clathrin-coated pits at the plasma membrane
(Fig. 3b and Supplementary Fig. 8).

Figure 4 | Model for dynamin oligomerization and function. a, Model of the
oligomerized dynamin helix in the constricted state, in top and side view (see
also Supplementary Fig. 10). Three dimers (1-3) are uniformly coloured.
Whereas 13 stalk dimers complete one turn, the G domain of dimer (i)
associates with the G domain of dimer (i + 10). b, Structure-based illustration
for the proposed mechanism of the dynamin oligomer. Variations in the
assembly of consecutive dynamin molecules lead to dynamin helices of
different rise and diameter. For further explanation, see main text and
Supplementary Fig. 12.

The PH domains also undergo a pronounced rearrangement, to a
position below the stalk, with the lipid-binding loops oriented
towards the membrane (Supplementary Fig. 11a). We suggest that the
stalk-PH domain interface is disrupted by binding of the PH domain to
high-affinity, phosphatidylinositol-4,5-bisphosphate-containing
membranes such as the plasma membrane, thereby promoting dynamin
oligomerization.

Finally, we suggest that rotation of stalk dimers via the flexible 
interface-1 and/or interface-3, leaving interface-2 unchanged, leads 
to bending of the linear oligomer and allows helix formation (Sup- 
plementary Figs 11b and 12). Interestingly, the IHGIR395- 
399AAAAA mutation in interface-3 prevented liposome binding 
and, consequently, the liposome-stimulated GTPase activity (Sup- 
plementary Fig. 7). This mutant further behaved in a dominant- 
negative fashion in transferrin uptake assays (Fig. 3b) and displayed 
a diffuse localization and reduced recruitment to clathrin-coated pits 
(Supplementary Fig. 8). These results point to the central role of 
interface-3 for the function of dynamin. 


Discussion 

The present work, combined with prior studies, suggests a
structural model for the mechanochemical coupling in dynamin that is
consistent with previous models. Our structural analysis indicates
that the diameter of helical dynamin assemblies is controlled by the 
angle between two stalk dimers (Supplementary Figs 11b and 12f-h). 
We suggest that this angle is adjusted in response to GTP-dependent 
dimerization of the G domains: a relaxed conformation of interface-1 
is adopted in the absence of G domain constraints, whereas GTP- 
triggered dimerization of the G domains constrains rotation in inter- 
face-1 and induces a bent, constricted conformation (Supplementary 
Fig. 11b), possibly via the BSE-stalk interface (Fig. 3a). Interestingly, 
in molecular dynamics simulations of two stalk dimers, the bent 
conformation of interface-1 rapidly converts towards a relaxed state, 
concomitant with an opening of the helical oligomer (Fig. 4b and 
Supplementary Fig. 12). This supports our assumption that
constraints from G domain dimerization are required for the
stabilization of the constricted state.

Accordingly, dynamin initially assembles via the stalks with
interface-1 in a relaxed conformation, allowing the filament to adopt
a range of different diameters (state I in Fig. 4b and Supplementary
Fig. 12). When the filament has embraced its template, GTP-loaded G
domains of adjacent turns dimerize, the inhibitory stalk-BSE
interface (Fig. 3a) is disrupted and the bent conformation of interface-1 is
induced. This will result in constriction of the filament if the lipid
template is flexible (state IV in Fig. 4b, constrictase model) or
compaction of the dynamin helix if the lipid template is rigid (state II in
Fig. 4b, poppase model). Constriction of a long dynamin helix will
induce a sliding of neighbouring filaments until a new constricted
equilibrium position of the oligomer is reached. This sliding is
observed as a rotary movement of the dynamin helix upon addition
of GTP in real-time imaging assays (twistase model). To reach the
constricted state along the whole assembly, several cycles of local
release and rebinding of neighbouring dynamin turns might be
triggered by GTP-dependent dimerization of G domains and dissociation
after GTP hydrolysis (state III in Fig. 4b). Accordingly, GTP binding
and hydrolysis are both required for the mechanochemical function of
dynamin and might induce local opening or twisting of the
constricted dynamin helix. The resulting shear forces could tear the
underlying membrane.

The complex domain interplay in dynamin rationalizes the high
degree of structural conservation in the dynamin superfamily, sheds
light on the molecular mechanisms of disease-associated mutations
and highlights structural features of the nucleotide-free state as a
prerequisite to understand dynamin’s mechanochemical function.


METHODS SUMMARY 


A human dynamin 1 construct (residues 6-746) containing the
IHGIR395-399AAAAA and K562E mutations was expressed as a His6-tag fusion in
Escherichia coli and purified to homogeneity. Crystals were obtained using
PEG400 and isopropanol as precipitants. The structure was solved by molecular
replacement. To verify the model, the positions of 19 out of 26 methionines were
determined by an anomalous data set of SeMet-substituted crystals. Liposome
binding assays were carried out as described (http://www.endocytosis.org).
GTPase assays were carried out at 37 °C using 20 mM HEPES/NaOH, pH 7.5,
150 mM NaCl, 2 mM KCl, 2 mM MgCl2 as reaction buffer. Uptake of
fluorescently labelled transferrin was monitored in HeLa cells depleted of endogenous
dynamin 2 by siRNA and transfected with the indicated siRNA-resistant
dynamin 2 constructs.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS


Protein expression and purification. Human dynamin 1 (residues 6-746) and
indicated mutants of this construct were expressed from pET46-EK/LIC
(Novagen) as N-terminal His6-tag fusions followed by a PreScission cleavage site.
The crystallized construct contained the mutations IHGIR395-399AAAAA, which
prevented oligomerization, and K562E, which reduced DNA contamination
during purification. Proteins were expressed in Escherichia coli host strain
Rosetta2-BL21-DE3, and bacteria were cultured in autoinduction medium (Novagen) for
7.5 h at 37 °C followed by a temperature shift to 20 °C for overnight expression.
Selenomethionine (SeMet)-substituted human dynamin 1 was expressed in M9
minimal medium, supplemented with L-amino acids Lys, Phe, Thr (100 mg l^-1),
Ile, Leu, Val, SeMet (50 mg l^-1) using the same vector and host strain as for native
protein expression. Cells were resuspended in buffer A (25 mM HEPES/NaOH
(pH 7.9), 500 mM NaCl, 2 mM MgCl2, 1 µM DNase (Roche), 500 µM Pefabloc
(Roth)) and disrupted by a microfluidizer (Microfluidics). Cleared lysates
(95,000g, 1 h, 4 °C) were incubated with Benzonase (Novagen) for 30 min at
4 °C before application to a Co2+-Talon column (Clontech). Protein was eluted
with buffer A containing additional 100 mM imidazole. Fractions containing
human dynamin 1 were incubated with 2.4 mM β-mercaptoethanol and
His6-tagged PreScission protease overnight at 4 °C. Using 50 kDa molecular weight
cut-off concentrators (Amicon), imidazole, β-mercaptoethanol and the free
His-tag were removed by washing with buffer A, before a second application to a
Co2+-Talon column to remove non-cleaved His-tagged dynamin 1 and protease.
The flow-through and four column volumes of washing buffer A were collected
and concentrated. Finally, dynamin 1 was purified by size exclusion
chromatography on a Superdex200 column (GE) in buffer containing 25 mM
HEPES/NaOH (pH 7.5), 300 mM NaCl, 2 mM MgCl2. Fractions containing dynamin 1
were pooled, concentrated and flash-frozen in liquid nitrogen (Supplementary
Fig. 1). SeMet-substituted protein was purified using the same protocol.
Crystallization and structure determination. Crystallization trials by the
sitting-drop vapour-diffusion method were performed at 4 °C using a
Hydra-plus-One pipetting robot (Matrix Technologies Corporation) and Rock Imager
storage system (Formulatrix). The human dynamin 1 construct (300 nl at a
concentration of 12 mg ml^-1) was mixed with an equal volume of reservoir
solution containing 9% PEG400, 6% isopropanol, 100 mM HEPES/NaOH
(pH 7.3), 10 mM MgCl2 and 10 mM KCl. Crystals of the native protein appeared
after 2 weeks and had dimensions of 0.2 mm × 0.05 mm × 0.02 mm. Crystals of
SeMet protein were obtained in 6% MPD, 10% isopropanol, 0.1 M HEPES/NaOH
(pH 7.5). During flash-cooling of the crystals in liquid nitrogen, a cryo-solution
containing additionally 20% ethylene glycol was used. All data were recorded at
BL14.1 at BESSY II, Berlin. One native data set was collected from a single crystal
and processed and scaled using the XDS program suite. Phase information was
obtained by molecular replacement with Phaser using the structure of the
nucleotide-free rat dynamin 1 G domain, the stalk of human MxA and the
human PH domain as search models. The model was built using COOT and
iteratively refined using CNS1.3 with a deformable elastic network, followed by
10 cycles of TLS refinement using 1 TLS group per domain in refmac5 (ref. 45). To
confirm the amino acid sequence, a data set of a SeMet-substituted protein crystal
was recorded at the selenium peak wavelength. Owing to non-isomorphism of the
native and SeMet crystals, molecular replacement with the refined native
structure was performed. The positions of 19 out of 26 selenium atoms in the dynamin
1 construct were determined by calculating an anomalous difference map with the
CCP4 program suite using the calculated phases after refinement. The final
native model comprises amino acids 6-746. Residues 63-64, 110-112,
143-149, 347-356, 394-404, 446-447, 500-517, 534-537, 578-581 and 632-652 are
disordered. Electron density for loop L2^BS (amino acids 709-715) was only
visible in the SeMet derivative structure and was therefore not modelled in the
native structure. However, L2^BS was included for figure preparation. Of all
residues, 86.9% are in the most favoured regions of the Ramachandran plot and
none is in the disallowed regions, as analysed with Procheck. Figures were
prepared with PyMol. Buried surface areas (per molecule) were calculated using
CNS. Solvent-accessible areas per residue were calculated using areaimol.
Domain superpositions were performed with lsqkab. Sequences were aligned
using CLUSTAL W and adjusted by hand.

Liposome co-sedimentation assays. Liposomes were prepared as previously
described (http://www.endocytosis.org). Folch liposomes (total bovine brain
lipids fraction I from Sigma) in 20 mM HEPES/NaOH (pH 7.5), 150 mM NaCl,
2 mM KCl were extruded through a 0.1 µm filter. Liposomes (0.1 mg ml^-1) were
incubated at room temperature with 4.0 µM of the indicated dynamin 1 construct
for 10 min in 40 µl reaction volume, followed by a 70,000g spin for 10 min at
20 °C.

GTP hydrolysis assay. GTPase activities of 1 µM of the indicated dynamin 1
constructs were determined at 37 °C in 20 mM HEPES/NaOH (pH 7.5), 150 mM
NaCl, 2 mM KCl, 2 mM MgCl2, in the absence and presence of 0.1 mg ml^-1
0.1-µm-filtered Folch liposomes, using saturating concentrations of GTP as
substrate (1 mM for the basal and 3 mM for the stimulated reactions). Reactions were
initiated by the addition of protein to the reaction. At different time points,
reaction aliquots were diluted 15-fold in GTPase buffer and quickly transferred
to liquid nitrogen. Nucleotides in the samples were separated via a reversed-phase
Hypersil ODS-2 C18 column (250 × 4 mm), with 10 mM tetrabutylammonium
bromide, 100 mM potassium phosphate (pH 6.5), 7.5% acetonitrile as running
buffer. Denatured proteins were adsorbed at a C18 guard column. Nucleotides
were detected by absorption at 254 nm and quantified by integration of the
corresponding peaks. Rates were derived from a linear fit to the initial reaction
(<40% GTP hydrolysed).
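
The rate determination described above (a linear fit restricted to the initial phase, below 40% GTP hydrolysed) can be sketched as follows. This is an illustrative least-squares implementation under assumed inputs, not the authors' analysis script; the function and parameter names are hypothetical.

```python
def initial_turnover(times_min, fraction_hydrolysed, gtp_uM, enzyme_uM,
                     max_fraction=0.4):
    """Turnover (min^-1) from a linear fit to the initial reaction phase.

    times_min            sampling times in minutes
    fraction_hydrolysed  fraction of GTP converted at each time (0..1),
                         e.g. from integrated HPLC peak areas
    gtp_uM, enzyme_uM    total GTP and protein concentrations in µM
    """
    # Keep only points from the initial phase of the reaction.
    points = [(t, f * gtp_uM)
              for t, f in zip(times_min, fraction_hydrolysed)
              if f < max_fraction]
    if len(points) < 2:
        raise ValueError("need at least two points below the cut-off")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in points)
    slope = sxy / sxx            # µM GTP hydrolysed per minute
    return slope / enzyme_uM     # GTP molecules per enzyme per minute
```

Dividing the stimulated by the basal turnover obtained this way gives a fold-stimulation of the kind quoted in the main text.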

Transferrin uptake in HeLa cells. HeLa cells were transfected with siRNA using
Oligofectamine (Invitrogen) on day 1. The sequence of the siRNA against human
dynamin 2 is 5’-GCAACUGACCAACCACAUC-3’, targeting nucleotides
849-867. On day 2, cells were transfected with pEGFP-N1 (Clontech) or siRNA-resistant
rat dynamin 2-pEGFP-N1 using Lipofectamine 2000 (Invitrogen) and seeded on
coverslips. On day 3, cells were serum-starved and incubated with 20 µg ml^-1
transferrin-Alexa568 (Molecular Probes, Invitrogen) for 10 min at 37 °C. After four
PBS washes on ice, cells were paraformaldehyde-fixed for 20 min at room
temperature. Transferrin uptake was analysed using a Zeiss Axiovert200M microscope and
Slidebook imaging software (3i Inc.). Internalized transferrin was quantified from
transfected cells only and normalized to the value of eGFP-transfected,
mock-treated cells (n = 28-83 images, five independent experiments;
IHGVR395-399AAAAA, I10D, QD17-18AA: three independent experiments; E368K, R369W,
S619L, F372D: two independent experiments). Knockdown of dynamin 2 and
expression levels of dynamin 2-eGFP mutants were assessed by immunoblotting
using antibodies to dynamin 2 (a gift of M. A. McNiven), β-actin (clone AC-15,
Sigma-Aldrich) and eGFP (clone JL-8, Clontech).

Localization of dynamin 2-eGFP mutants. HeLa cells were transfected with
dynamin 2-pEGFP-N1 wild-type or mutant constructs 20 h before fixation in 4%
paraformaldehyde for 12 min at room temperature. Cells were blocked and
permeabilized in 10% goat serum, 0.3% Triton X-100, 100 mM NaCl in phosphate
buffer and stained for endogenous adaptor complex AP-2 using an antibody to
α-adaptin (clone AP-6, Abcam). Total internal reflection fluorescence (TIRF)
imaging was performed using a Zeiss Axiovert200M microscope equipped with
a ×100 TIRF objective and a dual-colour TIRF setup from Visitron Systems using
Slidebook imaging software (3i Inc.).

Loop modelling and molecular dynamics simulations. For modelling of the
unresolved loop regions L1N^S and L2^S, two stalk dimers in the constricted state
served as a scaffold. Using Modeller (9v8), the scaffold was fixed in position,
whereas L1N^S and L2^S could freely sample the empirical potential function. To
reduce the conformational search space, additional harmonic distance restraints
were added between conserved residues Arg 399–Asp 406 and Glu 355–Arg 361.
Based on the modelled stalk tetramer, five independent all-atom molecular
dynamics simulations (NVT ensemble), each of 90 ns, were conducted at
T = 300 K in a periodic boundary setting using Gromacs 4.5.3 (ref. 51). The
model was immersed in a rectangular 20 nm × 10 nm × 9 nm box, containing
approximately 56,400 water molecules, 21 sodium and 17 chloride ions to
neutralize the system, resulting in a total number of 185,857 atoms. As force fields,
Amber99 (protein and ions) and TIP3P (water) were applied. To treat
long-range interactions, the Particle-mesh Ewald method was used. A cut-off of
1 nm was used for the real parts of electrostatic and van der Waals interactions.
All hydrogen bonds were constrained by using the LINCS algorithm, allowing
for an integration time-step of 2 fs. For the thermostatted integration, Langevin
dynamics were used as implemented by the Gromacs s.d. integrator (tau_t = 1).

For the calculation of bending and twisting angles, each of the four stalk
monomers was represented by two geometric centres, defined as the mean position of Cα
atoms of residues 366-377, 420-430, 468-481 and 671-683 for position A and
residues 360-365, 428-445, 457-472 and 686-701 for position B (Supplementary
Fig. 12d). The stalk bending angle α was defined as the mean angle between parallel
stalks, and the twisting angle β by the minimal angle between the planes spanned
by each dimer (positions A, B, B’ in Supplementary Fig. 12f).
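
The geometric quantities defined above reduce to three operations: averaging atom positions into a centre, taking the angle between two A-to-B axes, and taking the minimal angle between two planes each defined by three centres. The sketch below shows one way to compute them; the helper names are hypothetical, and how the stalks are paired and the angles averaged is left to the caller because the text does not spell it out.

```python
import math

def centre(points):
    """Geometric centre (mean position) of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angle_deg(u, v):
    """Angle between two vectors in degrees (0..180)."""
    c = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def axis_angle_deg(a1, b1, a2, b2):
    """Angle between two stalk axes, each running from centre A to centre B."""
    return angle_deg(_sub(b1, a1), _sub(b2, a2))

def plane_angle_deg(plane1, plane2):
    """Minimal angle (0..90) between two planes, each given by three points."""
    n1 = _cross(_sub(plane1[1], plane1[0]), _sub(plane1[2], plane1[0]))
    n2 = _cross(_sub(plane2[1], plane2[0]), _sub(plane2[2], plane2[0]))
    a = angle_deg(n1, n2)
    return min(a, 180.0 - a)
```

Taking the smaller of the two supplementary angles in `plane_angle_deg` is what makes the result independent of the order in which the three points of a plane are listed.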

For each simulation time step, the corresponding stalk tetramer structure
describes a linear transformation of the first dimer onto the second dimer,
consisting of a translation vector and a rotation matrix. This linear transformation was
used to reconstruct the structure of an ideal dynamin helix by applying it to the
dynamin dimer model in the constricted state. The diameter, d, and the rise per
turn, r, of these helices were measured by using the geometric centres of the stalk
coordinates and obtaining trajectories in (d,r). Based on these trajectories, the free
energy surface of stalk helix conformations was calculated: the two-dimensional
space (d,r) was discretized into boxes of size 25 × 25 Å. Based on the simulation
trajectories, the transition probability between all pairs of boxes was computed,
which allowed the calculation of an equilibrium probability of finding a single stalk
tetramer in a given box, p_1(d,r) (ref. 57). When more than two stalk dimers are assembled,
non-cooperative behaviour of neighbouring dimers has to be considered, for
example, the assembled stalk dimers can almost independently switch between
different conformations. The resulting equilibrium distribution of two
independent tetrameric units would be given by the convolution of two single-tetramer
distributions, p_2(d,r). It was found that for only about three such convolutions,
the resulting probability distribution converges to p_3(d,r) ≈ p(d,r). Thus, assuming
that the helix has at least three independently switching subunits, the free energy
landscape is unique, and is given by F(d,r) = −kT ln(p(d,r)), where k is the
Boltzmann constant and T the temperature.
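
The chain from a (d,r) trajectory to a free energy can be illustrated with a small sketch: bin each frame into a 25 × 25 Å box, count transitions between consecutive frames, take the stationary distribution of the resulting transition matrix, and apply F = −kT ln p. This is a simplified reading of the procedure, not the authors' code; it covers only the single-tetramer distribution (no convolution step), and all names plus the choice of a damped power iteration are assumptions.

```python
import math
from collections import defaultdict

def box_of(d, r, size=25.0):
    """Index of the square box (edge `size`, in Å) containing the point (d, r)."""
    return (math.floor(d / size), math.floor(r / size))

def equilibrium_probabilities(trajectory, size=25.0, sweeps=2000):
    """Stationary box probabilities from a time-ordered list of (d, r) pairs.

    Assumes the visited boxes form a single connected set and that every
    visited box is also left at least once in the trajectory.
    """
    boxes = [box_of(d, r, size) for d, r in trajectory]
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(boxes, boxes[1:]):
        counts[a][b] += 1
    states = sorted(set(boxes))
    if any(s not in counts for s in states):
        raise ValueError("a visited box has no outgoing transition")
    # Row-normalize the counts into transition probabilities.
    T = {a: {b: n / sum(row.values()) for b, n in row.items()}
         for a, row in counts.items()}
    p = {s: 1.0 / len(states) for s in states}
    for _ in range(sweeps):
        # Damped ("lazy") update: same stationary distribution, no oscillation.
        nxt = {s: 0.5 * p[s] for s in states}
        for a in states:
            for b, t in T[a].items():
                nxt[b] += 0.5 * p[a] * t
        p = nxt
    return p

def free_energy(p, kT=1.0):
    """F = -kT ln p for every populated box (in units of kT by default)."""
    return {s: -kT * math.log(q) for s, q in p.items() if q > 0.0}
```

With kT left at 1.0 the result is expressed in units of kT; passing kT in kJ mol^-1 at the simulation temperature would give absolute values instead.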
High-fidelity projective read-out of a solid-state spin quantum register

Lucio Robledo, Lilian Childress, Hannes Bernien, Bas Hensen, Paul F. A. Alkemade & Ronald Hanson


Initialization and read-out of coupled quantum systems are essential
ingredients for the implementation of quantum algorithms.
Single-shot read-out of the state of a multi-quantum-bit (multi-qubit)
register would allow direct investigation of quantum correlations
(entanglement), and would give access to further key resources
such as quantum error correction and deterministic quantum
teleportation. Although spins in solids are attractive candidates for
scalable quantum information processing, their single-shot detection
has been achieved only for isolated qubits. Here we demonstrate
the preparation and measurement of a multi-spin quantum
register in a low-temperature solid-state system by implementing
resonant optical excitation techniques originally developed in
atomic physics. We achieve high-fidelity read-out of the electronic
spin associated with a single nitrogen-vacancy centre in diamond,
and use this read-out to project up to three nearby nuclear spin
qubits onto a well-defined state. Conversely, we can distinguish
the state of the nuclear spins in a single shot by mapping it onto,
and subsequently measuring, the electronic spin. Finally, we show
compatibility with qubit control: we demonstrate initialization,
coherent manipulation and single-shot read-out in a single
experiment on a two-qubit register, using techniques suitable for
extension to larger registers. These results pave the way for a test of
Bell’s inequalities on solid-state spins and the implementation of
measurement-based quantum information protocols.

The electronic spin of the nitrogen-vacancy centre (NV) in
diamond constitutes an exceptional solid-state system for investigating
quantum phenomena, combining excellent spin coherence with a
robust optical interface. Furthermore, the host nitrogen nuclear
spin (typically 14N, with nuclear spin I = 1) and nearby isotopic
impurity 13C nuclei (I = 1/2) have hyperfine interactions with the
NV’s electronic spin (S = 1), allowing development of few-spin
quantum registers that have been suggested as building blocks for
quantum repeaters, cluster state computation and distributed
quantum computing. All of these applications require high-fidelity
preparation, manipulation and measurement of multiple spins. There
have been significant advances in coherent control over few-spin
systems in diamond, but no method exists for the simultaneous
preparation and single-shot read-out of multi-spin registers, which
impedes progress towards multi-qubit protocols. Here we remove this
obstacle by exploiting resonant excitation techniques, as pioneered in
atomic physics, in microstructured diamond devices that allow
high photon collection efficiency (Fig. 1a). These new methods enable
us to initialize multiple nuclear spin qubits and to perform single-shot
read-out of a few-qubit register, clearing the way towards
implementation of quantum algorithms with solid-state spins.

Our preparation and read-out techniques rely on resonant
excitation of spin-selective optical transitions of the NV, which can be
spectrally resolved at low temperatures. We use the E_x and A_1
transitions in our experiments (Fig. 1b): A_1 connects the ground states
with spin projection m_S = ±1 to an excited state with a primarily
m_S = ±1 character, whereas E_x connects states with m_S = 0. A typical
spectrum of NV A, one of the two NVs we study, is shown in Fig. 1c
(see Supplementary Information for NV B). Under resonant excitation
of a single transition, the fluorescence decays with time owing to a
slight spin mixing within the excited states that induces shelving into
the other spin state (Fig. 1d). This optical pumping mechanism allows
high-fidelity spin state initialization: from the data in Fig. 1d, we
estimate a preparation error into the m_S = 0 ground state of
0.3 ± 0.1%, which is a drastic reduction of the 11 ± 3% preparation
error observed with conventional off-resonant initialization
(Supplementary Information).

Figure 1 | Resonant excitation and electronic spin preparation of a
nitrogen-vacancy centre. a, Scanning electron microscope image of a solid
immersion lens representative of those used in the experiments (for details, see
Supplementary Information). The overlaid sketch shows the substitutional
nitrogen and the adjacent vacancy that form the NV. Inset, scanning confocal
microscope image of NV A (logarithmic colour scale). kct, 1,000 counts.
b, Energy levels used to prepare and read out the NV’s electronic spin (S = 1 in
the ground and optically excited states); transitions are labelled according to the
symmetry of their excited states. Dashed lines indicate spin-non-conserving
decay paths. MW, microwave transition. c, Photoluminescence excitation
spectrum of NV A; frequency is given relative to 470.443 THz. d, Fluorescence
time trace of NV A, initially prepared in m_S = 0 (E_x excitation, 4.8-nW power)
and m_S = ±1 (A_1 excitation, 7.4-nW power; inset), with a saturation power
P_sat ≈ 6 nW. Spin flips in the excitation cycle lead to nearly exponential decay of
fluorescence, with fitted spin-flip times of 1/γ = 8.1 ± 0.1 µs for E_x and
0.39 ± 0.01 µs for A_1, and initial respective intensities of 740 ± 5 and
95 ± 2 kct s^-1, giving a lower limit of 99.7 ± 0.1% to the m_S = 0 preparation
fidelity and one of 99.2 ± 0.1% to the m_S = ±1 preparation fidelity. The low
initial intensity for A_1 is associated with a fast intersystem crossing to
metastable singlet states (Supplementary Information).

Spin-dependent resonant excitation also allows single-shot elec- 
tronic spin read-out: the presence or absence of fluorescence under 
Ex excitation reveals the spin state. By working with low-strain NVs at 
low temperature (T = 8.6 K), we suppress spin mixing and 
phonon-induced transitions within the excited states, extending 
the spin relaxation time under Ex excitation to several microseconds. 
Together with a high collection efficiency due to the use of solid 
immersion lenses fabricated around pre-selected, low-strain NVs, 
and efficient rejection of resonant excitation from the measured 
phonon-sideband emission, this highly spin-preserving transition 
allows the detection of several photons before the spin flips. 
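
As a rough illustration of "several photons before the spin flips", the sketch below integrates an exponentially decaying fluorescence signal. It assumes the simple model I(t) = I0 exp(-t/tau) and takes I0 and tau from the values quoted in the Fig. 1 caption (read here as 740 kcts per second with 8.1 µs for Ex, and 95 kcts per second with 0.39 µs for A1); the function name and the single-exponential model are illustrative assumptions, not the authors' analysis.

```python
import math

def expected_photons(initial_rate_cps, spin_flip_time_s, window_s=math.inf):
    """Mean number of detected photons for fluorescence I0*exp(-t/tau),
    integrated from t = 0 to t = window_s (assumed single-exponential model)."""
    decay = math.exp(-window_s / spin_flip_time_s)
    return initial_rate_cps * spin_flip_time_s * (1.0 - decay)

# Values read from the Fig. 1d caption (our interpretation of the units).
photons_ex = expected_photons(740e3, 8.1e-6)            # about 6 photons
photons_a1 = expected_photons(95e3, 0.39e-6)            # well below 1 photon
photons_ex_40us = expected_photons(740e3, 8.1e-6, 40e-6)

print(f"Ex, unlimited window: {photons_ex:.2f} photons")
print(f"A1, unlimited window: {photons_a1:.3f} photons")
print(f"Ex, 40-us window:     {photons_ex_40us:.2f} photons")
```

Under these assumptions the Ex transition yields a handful of photons per shot while A1 yields far fewer than one, which is why Ex is the natural read-out transition. The excitation power in Fig. 1d differs from the read-out conditions, so the numbers are indicative only.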

We demonstrate single-shot read-out by initializing the electronic 
spin to have ms = 0 or ms = ±1, followed by resonant excitation of the 
Ex read-out transition for tro = 100 µs (Fig. 2a). The resulting histo- 
grams of the number of detected photons are given in Fig. 2b. As 
expected, for ms = ±1 we observe negligible excitation, with a 98.3% 
probability of not measuring any photons during the probe interval. By 
stark contrast, after initializing the spin to have ms = 0 we detect an 
average of ⟨n0⟩ = 6.4 photons per shot. We assign the state ms = 0 to 
detection of one or more photons, and ms = ±1 to the detection of no 
photons. After truncating our integration window to the optimal dura- 
tion of 40 µs, we find an average fidelity of 


$$F_{\mathrm{avg}} = \tfrac{1}{2}\left(F_{m_s=0} + F_{m_s=\pm 1}\right) = 93.2 \pm 0.5\%$$

where $F_{m_s}$ is the probability of obtaining the measurement outcome 
ms after optical pumping into ms. To verify that these measurement 
outcomes indeed correspond to the electronic spin states, we use 
single-shot read-out to observe spin Rabi oscillations and microwave- 
induced quantum jumps (Fig. 2d, e). 
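
The threshold assignment and average fidelity described above can be sketched in a few lines. The function below is a minimal illustration, assuming per-shot photon counts are available as lists; the function name and the toy count data are hypothetical and are not the measured histograms of Fig. 2b.

```python
def readout_fidelities(counts_ms0, counts_ms1, threshold=1):
    """Assign ms = 0 when counts >= threshold, otherwise ms = +/-1.

    counts_ms0: photon counts per shot after preparing ms = 0
    counts_ms1: photon counts per shot after preparing ms = +/-1
    Returns (F_ms0, F_ms1, F_avg).
    """
    f0 = sum(c >= threshold for c in counts_ms0) / len(counts_ms0)
    f1 = sum(c < threshold for c in counts_ms1) / len(counts_ms1)
    return f0, f1, 0.5 * (f0 + f1)

# Toy data for illustration only (not from the paper).
toy_ms0 = [0, 3, 7, 5, 0, 9, 6, 2, 4, 8]
toy_ms1 = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
f0, f1, f_avg = readout_fidelities(toy_ms0, toy_ms1)
print(f"F(ms=0) = {f0:.2f}, F(ms=+/-1) = {f1:.2f}, F_avg = {f_avg:.2f}")
```

With a threshold of one photon, errors for ms = 0 come from shots that yield no photons before the spin flips, and errors for ms = ±1 come from dark counts or imperfect preparation.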

Whereas the full read-out optically pumps the spin, shorter read- 
out durations can be non-destructive, albeit at lower fidelity. By 
optimizing integration windows, we obtain a fidelity of 83.4 ± 0.5% 
for each of two successive read-out segments (Fig. 2c). Correlations 

Figure 2 | Projective single-shot read-out of the NV’s electronic spin. 
a, Pulse sequence used for electronic spin read-out: after charge initialization 
(532 nm; not shown) the electron is pumped into the ms = 0 (A1, dark red) or 
ms = ±1 (Ex, bright red) state, and then undergoes optional microwave spin 
manipulation and spin read-out by a pulse resonant with Ex. Conditioning on 
simultaneous resonance during the final charge- and detuning-sensing stage 
eliminates effects of local electric field fluctuations or ionization 
(Supplementary Information). b, Statistics of photon counts detected during a 
tro = 100-µs electronic spin read-out after initialization into the respective 
ms = ±1 (red) and ms = 0 (superimposed light grey and inset) states, obtained 
from 10,000 measurement repetitions. c, When the 100-µs read-out pulse is 
divided into two read-out segments, R1 and R2, with a variable division point, 
the fidelity of two consecutive segments reaches 83.4 ± 0.5% for an optimal 
division time of 5.5 µs; the probability of identical sequential outcomes is 
82.0 ± 0.7%. Error bars (2 s.e., n = 10,000) are smaller than the symbols. 
d, Electronic spin Rabi oscillations between ms = 0 and ms = −1 at axial 
magnetic field Bz ≈ 15 G (purple): each data point comes from 1,000 single- 
shot read-out repetitions. The fit, which includes the detailed hyperfine level 
structure, yields a visibility of 78 ± 8%, where a maximum of 84% can be 
expected. Blue data points show the measurement outcome after projection 
into ms = 0 by selecting only read-out events with photons detected within the 
first 400 ns (Supplementary Information). All errors and error bars, 2 s.e. 
e, Quantum jumps in the fluorescence time trace during continuous spin read- 
out. Durations of dark periods depend on the microwave Rabi frequency 
(Supplementary Information). Blue data indicate the counts per 5-µs read-out 
bin, and the deduced spin state is shown in orange. 

between measurement outcomes indicate that the read-out is projec- 
tive. Following preparation of a superposition of spin states, we con- 
dition on detection of at least one photon (that is, measurement 
outcome ms = 0) during a first short read-out pulse, and probe the 
resulting spin state with a second read-out (Fig. 2d, blue data points). 
Regardless of the initial spin state, we observe a constant high prob- 
ability of obtaining ms = 0 in the second read-out. This shows that the 
read-out method is strongly projective and well suited for application 
in measurement-based quantum protocols. 

We use projective read-out of the electronic spin in combination 
with quantum gate operations for initialization and read-out of a few- 
qubit nuclear spin register. We first demonstrate the concept of mea- 
surement-based preparation on a single nuclear qubit. The electronic 
spin resonance spectrum for NV B (Fig. 3a, green trace) reveals the 
coupling to the host I = 1 14N nuclear spin: two partly overlapping sets 
of three hyperfine lines correspond to the ms = 0 ↔ −1 and 

Figure 3 | Nuclear spin preparation and read-out. a, Measurement-based 
preparation of a single 14N nuclear spin. In Earth’s ambient magnetic field of 
~0.5 G, without nuclear spin polarization we observe four resonances in the 
hyperfine spectrum (green trace) for NV B; the outer two correspond to the 
nuclear spin state with mI = −1 and the central two are combinations of the 
states with mI = {0, +1}. PL, photoluminescence intensity. Red and brown 
traces indicate transitions to the ms = +1 and ms = −1 states, respectively. As 
indicated in the circuit diagram, to initialize the nuclear spin we entangle it with 
the electronic spin and then read out the latter; p denotes the number of 
preparation steps, each of which is one iteration of the section of the circuit 
diagram enclosed in parentheses. Mcond indicates the conditioning 
measurement, M, the electron spin read-out and UMW the microwave spin 
manipulation. Data for the mI = −1 preparation are shown in orange; data for 
the mI = {0, +1} preparation are shown in cyan. Fits to Gaussian spectra show an 
amplitude ratio of 96 ± 4% in the desired nuclear spin state. b, Measurement- 
based preparation of a three-nuclear-spin register. Using a similar sequence 
(circuit diagram), we prepare a well-defined state for all three nuclear spins. A 
portion of the uninitialized hyperfine spectrum (green) contains 12 partly 
superposed lines, of which we prepare the single line corresponding to 

ms = 0 ↔ +1 electronic spin transitions, Zeeman-split by ~2 MHz 
in Earth’s magnetic field. The outermost transitions are associated with 
a specific nuclear spin state with spin projection mI, for example (ms, 
mI) = (0, −1) ↔ (1, −1) at 2.874 GHz. Our initialization procedure 
works as follows (Fig. 3a, circuit diagram). First we prepare the elec- 
tronic spin in ms = ±1. We then perform a nuclear-spin-controlled 
NOT operation on the electronic spin by applying a π-pulse at 
2.874 GHz; this operation rotates the electronic spin into ms = 0 only 
when mI = −1. Finally we read out the electronic spin for 400 ns. If one 
or more photons are detected during this interval, the two-spin system 
is projected into (ms, mI) = (0, −1). Alternatively, if we run the same 
protocol with initial electronic spin state ms = 0, we prepare the nuc- 
lear spin with mI = {0, +1}. 
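
The logic of this measurement-based preparation can be sketched as a short Bayesian calculation. The model below is a simplification under stated assumptions: the nuclear spin starts in an equal mixture of its three states, the π-pulse is ideal, a "click" (one or more photons) occurs with probability `p_click` when the electron is in ms = 0 and with probability `p_dark` otherwise. Both probabilities and the function name are hypothetical illustration values, not numbers from the experiment.

```python
def measurement_based_preparation(p_click, p_dark, prior_target=1.0 / 3.0):
    """One heralded preparation step for the nuclear spin.

    p_click: assumed probability of >= 1 photon when the electron is in ms = 0
             (this happens only if the nuclear spin is in the target state)
    p_dark:  assumed probability of >= 1 photon when the electron stayed dark
    prior_target: prior probability of the target nuclear state (1/3 for an
             unpolarized spin-1 nucleus)
    Returns (fraction of shots kept, probability of target state given a click).
    """
    p_success = prior_target * p_click + (1.0 - prior_target) * p_dark
    purity = prior_target * p_click / p_success
    return p_success, purity

p_success, purity = measurement_based_preparation(p_click=0.3, p_dark=0.005)
print(f"kept shots: {p_success:.3f}, P(mI = -1 | click): {purity:.3f}")
```

The sketch makes the trade-off explicit: a short read-out keeps only a modest fraction of shots, but the kept shots are strongly enriched in the target nuclear state as long as false clicks are rare.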

The efficiency of the nuclear spin initialization is evidenced by its 
drastic effect on the electronic spin resonance spectrum (Fig. 3a). 
Whereas before preparation the depths of the different hyperfine lines 

mI = (−1, 1/2, 1/2) (orange). Gaussian fits constrained to known hyperfine 
splittings yield an amplitude ratio of 88 ± 10%. The observed visibility can be 
improved by performing two preparation steps and electronic spin repumping 
(p = 2; five red data points), yielding a contrast of 82% of the expected visibility 
from known read-out fidelity (Supplementary Information). Uncertainties and 
error bars, 2 s.e. (n = 1,000 for a and b with p = 1; selected from 10,000 
measurement runs for b with p = 2). c, Single-shot measurement of the 14N 
nuclear spin, preceded by two preparation steps (p = 2). Read-out (three 
repetitions) conditioned on successful preparation distinguishes mI = −1 
(orange; threshold <1 count) from mI = {0, +1} (cyan) with an average fidelity 
of 92 ± 2% (Supplementary Information). d, Multiple-nuclear-spin read-out. 
Using a sequence similar to that in c (inset), we distinguish one of the 12 
hyperfine states associated with NV A. To prepare nuclear spin states, we 
perform the read-out procedure seven times and keep only data with zero total 
counts (identified as mI = (−1, 1/2, 1/2)) or ≥2 counts per initialization step 
(other states). Subsequent read-out with six repetitions (mI = (−1, 1/2, 1/2); 
state discrimination threshold, <3 total counts) achieves a 96.7 ± 0.8% average 
fidelity for preparation and detection of the nuclear spin configuration. 

indicate an equal mixture of the nuclear spin states (green trace), after 
preparation only the hyperfine lines corresponding to the prepared 
states are visible (mI = −1 for the orange trace and mI = {0, +1} for 
the cyan trace). 

The same nuclear spin initialization scheme can be applied to multi- 
qubit registers. Figure 3b shows the electronic spin resonance spec- 
trum of NV A (green trace), whose electronic spin is coupled to both 
the host 14N nuclear spin and two nearby 13C nuclei (Supplementary 
Information). The lowest-frequency line corresponds to a single state 
of the three nuclear spins. A π-pulse on this transition therefore imple- 
ments a triple-controlled NOT operation on the electronic spin 
(Fig. 3b, circuit diagram), allowing the initialization of all three nuclear 
spins (Fig. 3b, orange trace). The initialization can be further improved 
by repeating the preparation step (Fig. 3b, red data points). 

The nuclear qubits can be read out in a single shot by applying a 
nuclear-spin-controlled NOT operation to the electronic spin and 
subsequently reading out the electronic spin (Fig. 3c, d, inset circuit 
diagrams). Because the back-action of the electronic spin measure- 
ment on the nuclear spin is weak, we can repeat the process to obtain 
higher read-out fidelity. Figure 3c compares the resulting photon 
statistics for NV B after initialization into the single nuclear spin state 
mI = −1 with those obtained for mI = {0, +1}, and indicates an aver- 
age read-out fidelity of 92 ± 2%. This number is a lower bound on the 
true read-out fidelity, as it includes errors in state preparation. 
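
To see why repeating the read-out helps, the sketch below models the total photon count over several repetitions as a Poisson variable and applies a count threshold. It assumes Poisson statistics, no back-action on the nuclear spin and perfect state preparation; the per-repetition mean counts are hypothetical placeholders, and the function names are illustrative only.

```python
import math

def poisson_cdf(k_max, mean):
    """P(N <= k_max) for N ~ Poisson(mean); returns 0.0 when k_max < 0."""
    return sum(math.exp(-mean) * mean**k / math.factorial(k)
               for k in range(k_max + 1))

def repeated_readout_fidelity(n_reps, mean_bright, mean_dark, threshold=1):
    """Average fidelity when the summed counts of n_reps repetitions are
    compared with a threshold: total < threshold -> 'dark' state,
    total >= threshold -> 'bright' state (idealized, no back-action)."""
    f_dark = poisson_cdf(threshold - 1, n_reps * mean_dark)
    f_bright = 1.0 - poisson_cdf(threshold - 1, n_reps * mean_bright)
    return 0.5 * (f_dark + f_bright)

# Hypothetical mean counts per repetition, chosen only for illustration.
fid_1 = repeated_readout_fidelity(1, mean_bright=1.0, mean_dark=0.01)
fid_3 = repeated_readout_fidelity(3, mean_bright=1.0, mean_dark=0.01)
print(f"1 repetition:  {fid_1:.3f}")
print(f"3 repetitions: {fid_3:.3f}")
```

In this idealized model, more repetitions reduce the chance of missing the bright state faster than they increase false counts for the dark state. In the experiment the gain is eventually limited by optically induced nuclear spin flips, which the sketch ignores.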

A straightforward extension of this scheme can be used to read out 
the complete state of a register of multiple nuclear spins. By using a 
multiply controlled NOT gate in the read-out sequence, we can mea- 
sure in a single shot whether the register is in a particular configura- 
tion. We demonstrate this procedure on NV A, where we identify the 
three-nuclear-spin state mI = (−1, 1/2, 1/2) (Fig. 3d and Supplemen- 
tary Information). The other possible configurations can be probed by 
sequential application of this read-out scheme to different spectrally 
resolved hyperfine transitions, or, alternatively, by systematically 
flipping the nuclear spin qubits and repeating the read-out on the same 
hyperfine transition. 

Electronic-nuclear flip-flop processes in the optically excited state, 
which reduce the nuclear spin read-out fidelity, pose a significant 
hurdle to scaling the read-out to more qubits. Critically, resonant 
read-out allows selection of which states undergo optical excitation. 
By starting with the electronic spin in an ms = ±1 state, optical excita- 
tion will occur only when the register is in the state being probed; 
therefore, no optically induced nuclear spin flips will occur during 
measurement of any of the other states. Thus, by contrast with schemes 
depending on off-resonant excitation where each additional read-out 
step degrades the fidelity, resonant excitation allows scaling of high- 
fidelity read-out to larger registers. 

Finally, we demonstrate the compatibility of all the different tech- 
niques discussed here by implementing them in a single experiment: 
we initialize, coherently manipulate and then read out a two-qubit 
register consisting of the electronic spin and 14N nuclear spin of NV 
B. After initializing it to have (ms, mI) = (0, −1), we rotate the nuclear 
spin using a radio-frequency pulse and subsequently rotate the 

Figure 4 | Initialization, manipulation and read-out of a two-qubit register. 
a, After initialization of NV B into the (ms, mI) = (0, −1) state, we use radio- 
frequency excitation (spin rotation RRF at 4.9464 MHz) to drive the nuclear 
spin and then microwaves (spin rotation RMW at 2.8774 GHz) to drive the 
electronic spin. The electronic spin state is subsequently measured for 15 µs, 
and this is followed by five read-out steps (each of 10 µs) of the 14N nuclear spin 
state. b, Probability of observing ms = 0 conditional on the measured nuclear 
spin state and averaged over all radio-frequency pulse durations, as a function 
of microwave pulse duration (Supplementary Information). c, Probability of 
observing mI = −1 conditional on the observed electronic spin state and 
averaged over all microwave pulse durations, as a function of radio-frequency 
pulse duration (Supplementary Information). Error bars and uncertainties, 
2 s.e.; data in a are based on 1,000 measurements per pixel. 

electronic spin with a microwave pulse. We then read out the elec- 
tronic spin and, subsequently, the 14N nuclear spin state (Fig. 4a, 
circuit diagram). The left-hand plot in Fig. 4a shows the read-out 
results for the electronic spin qubit, showing Rabi oscillations as a 
function of microwave pulse length. By contrast, the read-out results 
for the nuclear spin qubit (right-hand plot of Fig. 4a) show Rabi 
oscillations as a function of the radio-frequency pulse length. 

To quantify crosstalk, we closely examine correlations between the 
two measurement outcomes. We observe that the electronic spin Rabi 
oscillation depends on the outcome of the nuclear read-out (Fig. 4b), 
but this dependence can be fully accounted for by the finite microwave 
power used in this experiment (Supplementary Information). The 
observed correlations thus arise from imperfect manipulation rather 
than from measurement crosstalk. For the nuclear spin read-out, 
however, true measurement crosstalk appears: nuclear Rabi oscillation 
amplitudes decrease when the electronic spin is measured to be in 
ms = 0 (Fig. 4c) because optical excitation during electronic spin 
read-out (which only succeeds for ms = 0) induces nuclear spin relaxa- 
tion (see Supplementary Information for details). This effect can be 
mitigated by improving the collection efficiency (thus reducing the 
read-out duration), for example by integrating the NV into an optical 
cavity. Also, application of moderate magnetic fields can decrease the 
optically induced nuclear spin relaxation rate by orders of magnitude. 

Our results have implications for a broad range of spin-based appli- 
cations. Single-shot electronic spin read-out can drastically improve 
NV-based sensors by allowing fast, quantum-projection-limited 
detection, creating opportunities in low-temperature magnetome- 
try. Extension of nuclear spin preparation techniques to remote 
nuclei in the spin bath may permit line-narrowing for enhanced sensi- 
tivity to d.c. magnetic fields. Furthermore, the preparation, manipula- 
tion and single-shot read-out of two spins opens the door to the 
exploration of two-particle quantum correlations, such as Bell’s 
inequalities, and elementary quantum information processing proto- 
cols. Importantly, the techniques we describe are extendable to larger 
spin registers, and can be combined with precise spin qubit control and 
dynamical decoupling to give coherence protection. The prepara- 
tion and read-out fidelities reported here are sufficient for demonstrat- 
ing measurement-based entanglement generation and quantum 
teleportation of spin qubits, and for exploring elementary quantum 
error correction schemes. Ultimately, the integration of multi-spin 
registers with quantum optical channels by means of spin-photon 
entanglement may allow their application as few-qubit nodes in 
long-distance quantum communication protocols or distributed 
quantum information processing networks. 


METHODS 


All data were obtained by detecting photons emitted into the phonon sideband 
(wavelength, 650-750 nm). For photoluminescence excitation spectroscopy, 
excitation with 5.5-nW red light is applied while microwaves at 2.878 GHz, 
coupled through an on-chip stripline, drive the electronic spin transitions to 
prevent optical pumping. Scans are recorded in a single laser frequency sweep 
at ~200 MHz s^-1, and are preceded by a 10-µs pulse of 532-nm excitation 
(50 µW). The green light is necessary to reset the negative-charge state of the 
NV, which can be photoionized by continuous resonant excitation. For all other 
experiments, 532-nm-induced spectral diffusion must also be controlled: to ensure 
that the NV is on resonance with the red excitation lasers, we condition our data on 
strong fluorescence upon simultaneous Ex and A1 excitation following the experi- 
mental sequence (details in Supplementary Information). All errors and error bars 
are 2 s.e. statistical uncertainty in the mean (95% confidence interval). 
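
For a probability estimated from n independent single-shot outcomes, a common way to obtain a 2 s.e. uncertainty is the binomial standard error. The sketch below assumes that simple binomial model; whether the authors used exactly this estimator is not stated in the text, and the function name is illustrative.

```python
import math

def two_se_binomial(p_hat, n):
    """Two standard errors of a proportion p_hat estimated from n trials,
    using the binomial standard error sqrt(p(1-p)/n) (assumed model)."""
    return 2.0 * math.sqrt(p_hat * (1.0 - p_hat) / n)

# Example: a proportion of 0.82 estimated from 10,000 repetitions.
half_width = two_se_binomial(0.82, 10_000)
print(f"0.82 +/- {half_width:.4f}")
```

For proportions near 0.8 and n = 10,000 this gives an uncertainty slightly below one percentage point, the same order as the sub-percent error bars quoted for the 10,000-repetition measurements.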

A reserve stem cell population in small intestine 
renders Lgr5-positive cells dispensable 


Hua Tian, Brian Biehs, Soren Warming, Kevin G. Leong, Linda Rangell, Ophir D. Klein & Frederic J. de Sauvage 


The small intestine epithelium renews every 2 to 5 days, making it 
one of the most regenerative mammalian tissues. Genetic inducible 
fate mapping studies have identified two principal epithelial stem cell 
pools in this tissue. One pool consists of columnar Lgr5-expressing 
cells that cycle rapidly and are present predominantly at the crypt 
base. The other pool consists of Bmi1-expressing cells that largely 
reside above the crypt base. However, the relative functions of these 
two pools and their interrelationship are not understood. Here we 
specifically ablated Lgr5-expressing cells in mice using a human 
diphtheria toxin receptor (DTR) gene knocked into the Lgr5 locus. 
We found that complete loss of the Lgr5-expressing cells did not 
perturb homeostasis of the epithelium, indicating that other cell 
types can compensate for the elimination of this population. After 
ablation of Lgr5-expressing cells, progeny production by Bmi1- 
expressing cells increased, indicating that Bmi1-expressing stem 
cells compensate for the loss of Lgr5-expressing cells. Indeed, 
lineage tracing showed that Bmi1-expressing cells gave rise to 
Lgr5-expressing cells, pointing to a hierarchy of stem cells in the 
intestinal epithelium. Our results demonstrate that Lgr5-expressing 
cells are dispensable for normal intestinal homeostasis, and that 
in the absence of these cells, Bmi1-expressing cells can serve as an 
alternative stem cell pool. These data provide the first experimental 
evidence for the interrelationship between these populations. The 
Bmi1-expressing stem cells may represent both a reserve stem 
cell pool in case of injury to the small intestine epithelium and a 
source for replenishment of the Lgr5-expressing cells under non- 
pathological conditions. 

Two types of stem cells have been described in the small intestine 
based on location and cycling dynamics. Fast-cycling stem cells 
express markers including Lgr5, Cd133 (also known as Prom1) and 
Sox9 (refs 1, 5, 6) and are present throughout the intestine. Also known 
as crypt base columnar cells (CBCs), these slender cells populate the 
crypt and villi within 3 days, and are interspersed among the Paneth 
cells that support them. Slower-cycling stem cells, marked by 
enriched expression of Bmi1 or mouse Tert (mTert), represent a rarer 
cell population. These cells form a descending gradient from proximal 
to distal regions of the intestine, such that they are more prevalent in the 
duodenum than in the colon. Despite their rarity, Bmi1-expressing stem 
cells are crucial for crypt maintenance. 

To study the function of Lgr5-expressing cells, we replaced the first 
coding exon of Lgr5 with two distinct cassettes. The first consisted of a 
dsRED-IRES-CreERT2 sequence to enable genetic lineage tracing 
studies by tamoxifen (TAM)-inducible expression of Cre in Lgr5- 
expressing cells (Supplementary Fig. 1a, Lgr5CreER allele). The second 
cassette contained enhanced green fluorescent protein (EGFP) linked in 
frame to a human DTR cDNA (Supplementary Fig. 1a, Lgr5DTR allele), 
producing a fusion protein. Consistent with previous reports, one 
injection of TAM in Lgr5CreER;R26R mice marked Lgr5-expressing stem 
cells in a mosaic fashion and led to generation of labelled progeny for 
more than 60 days (Supplementary Fig. 1b). Expression of DTR-EGFP 
in Lgr5DTR/+ mice functioned as a reporter for Lgr5 expression (Fig. 1a) 
and also conferred diphtheria toxin (DT) sensitivity on CBCs. 
Expression of EGFP in mice carrying the Lgr5DTR allele was detected 
1c-e, CBCs are marked by asterisks). 

We next set out to test the effects of eliminating Lgr5-expressing 
cells by administering DT to Lgr5DTR/+ mice. Twenty-four hours after 
DT administration, all EGFP-positive cells were depleted, including 
CBCs (Fig. 1a, b, j, k, p, q). Loss of Lgr5-expressing cells was further 
confirmed by the absence of Lgr5 messenger RNA (Fig. 1d, e) and was 
accompanied by extensive apoptosis at the base of the crypts, with 
shedding of dead cells into the lumen (Fig. 1m, n). 

After sustained DT exposure for 10 days, both the EGFP reporter 
and Lgr5 mRNA were completely absent from the base of the crypts 
(Fig. 1c, f and Supplementary Fig. 2) but, notably, crypt architecture 
was comparable to controls (Fig. 1g, i, j, l). Proliferating CBCs were 
absent from the crypt (Fig. 1l, r), such that the crypt base was occupied 
mostly or entirely by Paneth cells (Supplementary Fig. 3a, b). The 
extensive apoptosis detected 24 h after DT treatment had significantly 
decreased by day 10 (compare Fig. 1n with o) but was still detectable. 
No increase in crypt fission after DT treatment was observed by hae- 
matoxylin and eosin staining at any time point (Fig. 1g-i). 

Because Lgr5-expressing cells have been proposed to have a critical 
role in renewal of the intestine, it was surprising that the architecture of 
the intestinal epithelium was essentially intact after ablation of Lgr5- 
expressing CBCs (Fig. 1g-i). Within the villi, very little change in the 
total number of endocrine cells was observed (Supplementary Fig. 3c, d), 
and goblet cells were abundant in the crypts and villi (Supplemen- 
tary Fig. 3g, h, j). Upon CBC ablation, Paneth cells were found at the 
bottom of the crypts and in some cases were mislocalized to the villi 
(Supplementary Fig. 3a, b and data not shown); additionally, migration 
of cells as assessed by BrdU pulse-chase labelling was normal (Sup- 
plementary Fig. 4). The only major difference from controls that we 
observed was in the secretory cell lineage; the number of chromogranin- 
A-positive enteroendocrine cells in the crypts doubled after DT admin- 
istration for 10 days (Supplementary Fig. 3e, f, i). 

We did not detect any Lgr5-expressing CBCs using either the EGFP 
reporter or in situ hybridization after 10 days of DT (Fig. 1c, f and 
Supplementary Fig. 2), but it was still possible that a few CBCs could 
have escaped ablation and repopulated the epithelium, as a similar 
scenario was reported in c-Myc and Ascl2 conditional null mice. 
To address this possibility directly, we visualized Lgr5-expressing cell 
activity during DT selection by producing Lgr5DTR/CreER;R26R mice. 
These mutant mice carried two null alleles at the Lgr5 locus, of which 
one enabled ablation of Lgr5-expressing cells and the other enabled 
lineage tracing of any possibly remaining Lgr5-expressing cells. These 
mice died at postnatal day (P)1, consistent with previous reports that 
Lgr5 null mice are not viable. To analyse the postnatal gut, we grew 
pieces of small intestine from embryonic day (E)15 Lgr5DTR/CreER;R26R 
embryos under the kidney capsule of immunocompromised mice for 

Figure 1 | Characterization of DT-mediated CBC ablation. a, EGFP is 
detected on the membrane of Ki67+ proliferating CBCs in saline-treated 
Lgr5DTR/+ mice. b, One dose of DT eliminates all DTR-EGFP-positive cells at 
24 h. c, DT treatment for up to 10 days prevents reappearance of Lgr5-expressing 
cells. d-f, Lgr5 mRNA is normally present at the bottom of the crypts (d) and is 
not detected after 24 h (e) or 10 days of DT treatment (f). g-i, Crypt architecture 
is intact after ablation of Lgr5-expressing CBCs. H&E, haematoxylin and eosin. 
j-l, Proliferation above the crypt base is normal after ablation of Lgr5-expressing 
CBCs. m-o, Extensive apoptosis is observed at the crypt base 24 h after DT and 
tapers off by 10 days, but is still higher than controls. 'Cl. caspase 3' is cleaved 
caspase 3. p-r, Electron microscopy shows that CBCs in controls are 
characterized by slender nuclei and scant cytoplasm. No CBCs remain at the 
crypt base after one dose or 10 days of DT treatment. The crypt base is filled with 
granule-rich Paneth cells. TEM, transmission electron microscopy. s, Dosing 
regimen for study of the recovery of Lgr5-expressing CBCs. t, No CBCs are 
detected 24 h after DT administration. D, day. u, A few Lgr5+/Ki67+ CBCs 
(arrow) were detected 48 h after the last dose of DT. v, More Lgr5+/Ki67+ CBCs 
(arrows) recovered after 96 h. Original magnification for panels: a-f, m-o and 
t-v at 40×; g-i at 20×; j-l at 63×; and p-r at 2,650×. 


three weeks, at which point they formed crypts comparable to P17 
intestine (Fig. 2a-e). After 10 days of TAM treatment, columns of 
blue cells emanated from the crypt base, and progeny of Lgr5-expressing 
cells differentiated into all four major cell types of the intestinal 

Figure 2 | Maintenance of normal crypt architecture is not mediated by 
Lgr5-positive cells that have escaped ablation. a, Ten-day lineage tracing of 
descendants of Lgr5-expressing stem cells shows a blue ribbon emanating from 
the base of the crypt in a grafted intestine piece from E15 Lgr5DTR/CreER 
embryos. b-e, Normal proliferation and differentiation of intestinal epithelium 
after loss of Lgr5 gene function. Lgr5-expressing stem cells can give rise to all 
four major differentiated cell types (arrows). X-GAL-positive cells mark Lgr5- 
positive stem cell progeny, which overlap with differentiated cell markers for 
goblet (c), Paneth (d) and endocrine cell (e) lineages. PAS, periodic acid Schiff; 
ChrA, Chromogranin A. f, Concurrent TAM and DT treatment kills all Lgr5- 
expressing cells. No progeny of Lgr5-expressing cells (blue) are detected in the 
grafted intestine. g-j, No GFP-positive cells are detected but proliferation and 
differentiation are normal after DT-mediated ablation of Lgr5-expressing 
CBCs. Original magnification for panels: a, b, f, g at 40x; c-e, h-j at 63x. 


epithelium (Fig. 2a-e). Concomitant administration of DT and 
TAM for 10 days eradicated all EGFP-positive CBCs (Fig. 2g), and 
no cells descended from Lgr5-expressing cells were observed 
(Fig. 2f), confirming that the Lgr5DTR allele leads to complete elimina- 
tion of these cells. Importantly, no abnormalities in graft morphology, 
differentiation or proliferation were observed in these mice compared 
to controls (Fig. 2a-j). 

Although Lgr5-expressing cells were completely depleted within 
24 h of DT treatment, persistence of apoptotic bodies at the crypt base 
throughout the 10-day DT treatment suggested that Lgr5-expressing 
CBCs were continuously generated and eliminated during the treat- 
ment (Fig. 1n, o). This notion was supported by the quick recovery of 
Lgr5-expressing cells between 48 and 96 h after the final dose of DT 
(Fig. 1s-v). To follow the fate of the newly generated Lgr5-expressing 
cells, mice implanted with Lgr5DTR/CreER;R26R embryonic intestine 
fragments in the kidney capsule were allowed to recover for 6 days 
in the presence of TAM following 6 days of DT treatment. A row of 
blue cells emanated from the crypt base (Supplementary Fig. 5a), indi- 
cating that the newly formed Lgr5-expressing stem cells (Supplemen- 
tary Fig. 5b, GFP-positive cells) gave rise to progeny that migrated 
out of the crypt. When the converse experiment was performed by 
injecting TAM for 6 days and then dosing with DT from days 6 to 12, 
blue cells were only present in the upper region of the villi (Sup- 
plementary Fig. 5c), indicating that progeny of Lgr5-expressing cells 
marked between day 1 and 6 migrated out of the crypts into the villi, but 
that during DT treatment between days 6 and 12, Lgr5-expressing stem 
cells were no longer available (Supplementary Fig. 5d, absence of GFP 
signal) to supply labelled (blue) progeny to replenish the epithelium. 

To study the long-term effects of CBC ablation, we isolated crypts 
from Lgr5DTR/+ mice to perform in vitro crypt organoid cultures. 
Crypts depleted of Lgr5-expressing CBCs by treatment for 10 days with 
DT, as indicated by absence of GFP expression, gave rise to organoids 
with similar efficiency as wild-type controls (Supplementary Fig. 6a, 
b). These could be passaged in vitro in DT for up to 2 months without 
losing their ability to expand and proliferate. No Lgr5-expressing 
(GFP-positive) cells were detected in organoid epithelium as long as 
the organoids were maintained in medium containing DT (Sup- 
plementary Fig. 6d). However, when DT was removed from the culture 
medium, Lgr5-expressing cells reappeared at the bottom of crypt-like 
structures within 3 days (Supplementary Fig. 6c, GFP-positive cells). 

Because we found that Lgr5-expressing CBCs were dispensable for 
crypt maintenance, we next asked whether Bmi1-expressing stem cells 
were mobilized to compensate for the loss of the Lgr5-expressing stem 
cells. Mouse BMI1 regulates self-renewal of haematopoietic and neuronal 
stem cells. We used a GFP knock-in allele (Bmi1GFP/+) to monitor Bmi1 
gene expression. Bmi1-expressing GFP-positive cells were most com- 
monly observed at positions 3 to 6 above the crypt base (Fig. 3a), con- 
sistent with the Bmi1 mRNA expression pattern in the small intestine. 
Upon depletion of Lgr5-expressing CBCs in Lgr5DTR/+;Bmi1GFP/+ mice 
after 9 days of DT treatment, the number of GFP-positive cells per crypt 
increased threefold (Fig. 3a-d and Supplementary Fig. 7a), and the 
proportion of crypts containing either single or multiple GFP-expressing 
cells increased by 40% compared to control animals (Supplementary 
Fig. 7b). Of note, 55% of the total number of GFP-positive crypts in 
Lgr5DTR/+;Bmi1GFP/+ mice now contained multiple GFP-positive cells 
(Fig. 3d and Supplementary Fig. 7b), compared with only 22% in 
Bmi1GFP/+ control animals. 

To trace the fate of cells descended from Bmi1-expressing cells 
after elimination of Lgr5-expressing CBCs, we generated a Bmi1-CreER 
bacterial artificial chromosome (BAC) transgenic allele (Supplemen- 
tary Fig. 8). Labelling kinetics using the Bmi1-CreER transgenic line 
crossed with the R26R reporter were identical to previously reported 
results using the Bmi1^CreER knock-in allele (Fig. 3f). Bmi1- 
CreER;R26R;Lgr5^DTR/+ animals were given alternating daily doses of 
DT and TAM for 7 days (Fig. 3e). Because Bmi1-expressing cells 
are most abundant in the first 5 cm of the duodenum, we focused our 
analysis on this region. Consistent with the increased number of Bmi1- 
expressing cells (Supplementary Fig. 7a), the proportion of LacZ- 
positive crypts (either partially or fully labelled) also increased by 34% 
upon loss of Lgr5-expressing CBCs (Supplementary Fig. 7c). The most 
marked difference was in the number of fully labelled crypts. Only 2.3% 
of crypts were fully labelled in Bmi1-CreER;R26R control mice during a 
6-day lineage tracing period, which was comparable with previous 
studies using a Bmi1^CreER knock-in allele. Upon loss of Lgr5-expressing 
CBCs, the number of fully labelled crypts increased approximately 15- 
fold (Fig. 3h, i and Supplementary Fig. 7c). These results indicate that in 
the absence of Lgr5-expressing cells, Bmi1-expressing cells are capable 
of directly giving rise to all intestinal cell types without going through 
Lgr5-positive intermediate cells. However, Bmi1-expressing stem cells 
did not give rise to an increased number of labelled crypts in more distal 
regions of small intestine and colon upon loss of Lgr5-expressing CBCs 
(Fig. 3f, g), indicating that alternative stem cell pools must compensate 
for the loss of Lgr5-expressing stem cells in distal regions of the gut. 

Lastly, we tested whether Bmi1-expressing cells give rise to Lgr5- 
expressing cells under normal conditions. Because Bmi1- and Lgr5- 
expressing cells represent distinct although overlapping cell popula- 
tions, we carried out a series of short-term pulse-chase experiments 


LETTER 


[Figure 3 panel labels: Bmi1^GFP/+; Bmi1^GFP/+;Lgr5^DTR/+; duodenum, jejunum, ileum, colon] 


Figure 3 | Bmi1-expressing stem cells are mobilized to compensate for the 
loss of Lgr5-expressing CBCs. a, Rare Bmi1-expressing cells (arrows) are 
detected at positions 3 to 6 of the crypt base in the duodenum of Bmi1^GFP/+ 
reporter mice. b, Increased Bmi1-expressing cells appear at the crypt base upon 
ablation of Lgr5-expressing CBCs. c, Higher magnification showing a Bmi1- 
expressing cell at position 4 of crypt base in Bmi1^GFP/+ reporter mice. d, Close- 
up view of a crypt with multiple Bmi1-expressing cells after ablation of Lgr5- 
expressing cells. Arrows in a-d indicate Bmi1-expressing cells. e, Dosing 
regimen for lineage tracing of Bmi1-expressing cell progeny after ablation of 
Lgr5-expressing CBCs. H, harvest. f, g, Whole-mount X-GAL staining of the 
gastrointestinal tract. In both control mice and after ablation of Lgr5-expressing 
CBCs, Bmi1-expressing cells produce more progeny in the proximal than in the 
distal intestine. h, i, Close-up view of X-GAL-positive crypts in duodenum. 
Most of the labelled crypts have fewer than five X-GAL-positive cells in Bmi1- 
CreER;R26R control animals. Ablation of Lgr5-expressing CBCs stimulates 
production of progeny by Bmi1-expressing cells. 36% of the crypts in the first 
5 cm of duodenum now become fully labelled (marked by arrows). Original 
magnification for panels: a-d at 40×; f, g at 1.2×; h, i at 20×. 


using Bmi1-CreER;R26R;Lgr5^DTR/+ mice. Twenty-four hours after 
TAM administration, most of the β-galactosidase (β-gal)-positive cells 
appeared as single cells, reflecting the normal pattern of Bmi1 expres- 
sion (Fig. 4a) in the initially labelled cells. Bmi1-expressing cells (β-gal 
positive) overlapped with Lgr5-expressing cells (GFP positive) 
between positions 1 and 6 at the crypt base; the double-positive cells 
peaked at positions 3 and 4 (Fig. 4a-c). This observation is consistent 
with a previous report stating that Bmi1 mRNA expression (via quanti- 
tative polymerase chain reaction (qPCR) analysis) was readily detect- 
able in Lgr5-positive cells. Later, between 48 and 72 h, clonal expansion 
Figure 4 | Bmi1-expressing cells give rise to Lgr5-expressing CBCs under 
normal and injury conditions. a-c, Bmi1-CreER;R26R;Lgr5^DTR/+ mice were 
dosed with 5 mg TAM and harvested 24 h later. β-Gal-positive cells (red) 
derived from Bmi1-expressing cells overlap with Lgr5-expressing CBCs (GFP- 
positive, green) at position 4 (arrow). A non-overlapping β-gal-positive cell was 
detected at position 7 in the same crypt (asterisk). d-f, More β-gal-positive cells 
(red) show overlapping expression (marked by arrow) with Lgr5-expressing 
CBCs (GFP+, green) at 48 h. g-i, At 72 h, clonal expansion from Bmi1-positive 
stem cells is now evident by a streak of β-gal-positive cells migrating upward 
(red). β-Gal-positive clones at lower crypt positions overlap with Lgr5- 


from Bmi1-expressing cells was evident, as β-gal/GFP double-positive 
cells now appeared as doublets or triplets (Fig. 4d-i). We scored a total 
of 500 crypts at each time point and found that although a few cells 
were β-gal/GFP double positive (that is, expressing both Bmi1 and 
Lgr5) at 24 h after TAM induction, this number doubled at 48 h 
(Fig. 4j, k). Similarly, lineage tracing from Bmi1-expressing cells 
carried out in mice treated for 6 days with DT and allowed to recover 
expressing CBCs (arrow). j, k, Distribution of the Bmi1-positive stem cell 
progeny (β-gal+ cells) within the crypt at 24 and 48 h after TAM induction. 
j, Bmi1-expressing cells appear as singles throughout the crypt base between 
positions 1 and 15. k, More cells are derived from Bmi1-expressing stem cells at 
48 h. A significant portion of β-gal+ cells also express Lgr5 (GFP+, green 
column), at positions 1 to 6. Overlapping cells (green) peak around positions 3, 
4 or 5. l, Dosing regimen used to study the recovery of Lgr5-expressing CBCs 
from Bmi1-positive cells. m-o, Bmi1-positive cells give rise to a fully labelled 
crypt (red), including newly formed Lgr5-expressing CBCs (GFP, arrows). 
The original magnification for all panels is 40×. 


for 72 h (Fig. 4l) demonstrated that newly formed Lgr5-positive cells at 
the bottom of the crypts arose from Bmi1-expressing cells (Fig. 4m-o). 
Together, these data show that Bmi1-expressing cells can give rise to 
Lgr5-expressing cells both under normal physiological conditions and 
after insults that deplete CBCs. Consistent with our observation, mTert- 
expressing stem cells have also been reported to give rise to Lgr5-positive 
cells over a 5-day lineage tracing period. 
Our data support the existence of two stem cell pools in the epithelium 
of the small intestine: an actively proliferating stem cell compartment 
responsible for the daily maintenance of the intestinal epithelium that is 
characterized by the expression of Lgr5, Ascl2 and Olfm4 (refs 1, 11, 17) 
and a distinct pool of stem cells expressing Bmi1. Our results lend 
support to the two-stem-cell pool model that is based on computational 
approaches, and provide experimental evidence for recent models 
predicting that the intestine could fully recover after complete elimina- 
tion of cellular subpopulations deemed to be functional stem cells. 
Our data do not support the recent proposal that Bmi1-expressing cells 
are exclusively a subset of Lgr5-expressing cells; rather they indicate 
that under normal circumstances, Bmi1-positive stem cells are 
upstream of rapidly cycling, Lgr5-expressing stem cells and replenish 
the pool of active stem cells, either to avoid exhaustion of actively 
cycling stem cells or to prevent the accumulation of damaged cells 
that may lead to the development of tumours. Importantly, we also 
demonstrate that when the Lgr5-expressing cell compartment is elimi- 
nated by DT treatment, Bmi1-expressing cells can increase in number, 
presumably as a compensatory mechanism. Under these conditions, 
Bmi1-expressing cells contribute directly to the generation of all cell 
types of the intestinal epithelium to produce a functional organ until 
the rapidly cycling stem cell compartment is able to recover. Although 
it has been proposed that Bmi1-expressing stem cells are quiescent, 
this remains to be conclusively demonstrated. 

Distinct stem cell pools with differing cycling dynamics have previ- 
ously been observed in the hair follicle and in blood, organs that, like 
the intestine, undergo regular bouts of proliferation and regenera- 
tion. The factors that regulate the interplay between discrete popu- 
lations of stem cells, and the precise hierarchical relationships among 
such populations, remain to be characterized. Although we have found 
that loss of Lgr5-positive cells is sustainable under short-term condi- 
tions in vivo, it remains to be determined whether such a scenario can 
persist for longer periods of time. Interestingly, depletion of Paneth 
cells, which are thought to be important for the maintenance of CBCs, 
can be tolerated by mice for over 6 months without significant struc- 
tural defects of the epithelium, supporting the idea that the intestine 
can function normally in the absence of CBCs. It will be important to 
determine how different stem cell populations sense the activity of 
other populations, whether rapidly cycling cells can repopulate more 
quiescent stem cell populations, and whether additional subpopula- 
tions of stem cells exist. 


METHODS SUMMARY 

Mice. Lgr5^DTR/+, Lgr5^CreER/+ and Bmi1-CreER alleles were generated as described 
in Methods. Bmi1^GFP/+ mice were provided by I. Weissman. All studies and 
procedures involving animal subjects were approved by the Institutional Animal 
Care and Use Committees of Genentech and the University of California, San 
Francisco, and were conducted strictly in accordance with the approved animal 
handling protocol. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Lgr5 and Bmi1 vector construction. The constructs for targeting the C57BL/6 
Lgr5 locus and the Bmi1 BAC transgene were made using a combination of 
recombineering, DNA synthesis and standard molecular cloning techniques. 

For Lgr5, a 7,213 bp fragment (assembly NCBI37/mm9, chr10:115,020,315- 
115,027,527) from a C57BL/6 BAC (RP23 library) was first retrieved into plasmid 
pBlight-TK. To generate the DTR-EGFP KI vector for Lgr5, a DTR-EGFP-pA- 
loxP-Neo-loxP cassette was synthesized (Blue Heron/Origene; the DTR-EGFP 
sequence was based on that described previously) and inserted right after the 
ATG of Lgr5 (chr10:115,024,547, reverse strand), deleting the remainder of exon 1 
and splice donor of intron 1 (a 212 bp deletion). To generate the CreERT2 KI 
vector, a dsRed2-IRES-CreERT2-pA-Frt-neo-Frt cassette was synthesized (Blue 
Heron/Origene) and inserted at the same position as the DTR-EGFP cassette. The 
final vectors were confirmed by DNA sequencing. 

The Lgr5 KI vectors were linearized with NotI and C57BL/6 C2 embryonic 
stem cells were targeted using standard methods (G418-positive and gancyclovir- 
negative selection). Positive clones were identified using PCR and TaqMan analysis, 
and confirmed by sequencing of the modified locus. Correctly targeted embryonic 
stem cells were transfected with a Cre or Flpe plasmid, respectively, to remove the 
Neo cassette. The modified embryonic stem cells were then injected into blastocysts 
using standard techniques, and germline transmission was obtained after crossing 
the resulting chimaeras with C57BL/6N females. 

For Bmi1, a 210 kb C57BL/6 BAC (RP23-181D14, assembly NCBI37/mm9, 
chr2:18,464,619-18,674,471) was obtained and characterized by DNA fingerprint- 
ing. The BAC contains the Bmi1 locus and considerable 5’ and 3’ flanking 
sequence. An IRES-CreERT2-pA-frt-Neo-frt cassette was synthesized (Blue 
Heron/Origene) and inserted, using recombineering, 85 bp 3’ of the Bmi1 stop 
codon (after position chr2:18,606,193). Neo was then removed by transforming 
the modified BAC into arabinose-induced SW105 cells expressing the yeast 
protein Flp. C57BL/6 transgenic mice carrying the modified Bmi1 BAC were 
obtained using standard pronuclear microinjection methods and characterized. 

We analysed the Lgr5^DTR/+ mice at 24 h after DT administration (50 µg kg^-1, 
intraperitoneal injection, n = 3), at 10 days of DT treatment (50 µg kg^-1 every 
other day for 10 days, n = 5), 48 h recovery (n = 3) and 96 h recovery time points 
after four doses of DT (n = 2). The DT treatment could not be extended beyond 10 
days due to severe liver toxicity apparently mediated by a subset of Lgr5-DTR- 
EGFP-expressing hepatocytes. We analysed Bmi1-CreER;R26R;Lgr5^DTR/+ mice at 24 h 
(n = 2), 48 h (n = 2) and 72 h (n = 2) after TAM injection. Two hundred and fifty 
β-gal-positive crypts were scored per mouse. 

Renal capsule explants. 3-5-mm small intestine pieces from E15 Lgr5^DTR/CreER 
embryos (n = 3) were grafted under the renal capsule of 6-8-week-old athymic 
nu/nu mice and allowed to develop for 3 weeks. We treated Lgr5^DTR/CreER;R26R 
renal grafts with 10 days TAM (n = 5), 10 days DT/TAM (n = 5), 6 days DT 
followed by 6 days TAM (n = 5) and 6 days TAM followed by 6 days DT 
(n = 5). Some GFP expression was seen outside of the CBC region due to perdur- 
ance of GFP protein as well as upregulation of the Lgr5 locus when Lgr5 is deleted. 

DT cell ablation. Mice (between 6 and 12 weeks old) were given DT at 50 µg kg^-1 
every 48 h through intraperitoneal injections. 

TAM labelling experiments. Mice (between 6 and 12 weeks old) were given 5 mg 
TAM in corn oil through intraperitoneal injection. 

Transmission electron microscopy. The tissues were fixed in 1/2 Karnovsky’s 
fixative (2% paraformaldehyde (PFA), 2.5% glutaraldehyde in 0.1 M sodium caco- 
dylate buffer, pH 7.2), washed in the same buffer, and post-fixed in 1% osmium 
tetroxide. The samples were then dehydrated through a series of ethanol, followed 
by propylene oxide and embedded in Eponate 12 (Ted Pella). Thin sections were 
stained with uranyl acetate and lead citrate and examined using a Philips CM12 or 
JEOL JEM-1400 TEM. 

Histology, immunohistochemistry and immunofluorescence. Animals were 
perfused with 2% PFA. Small intestine and colon were flushed with 2% PFA 
and fixed in 4% PFA overnight. Half of the materials were cryo-protected, embed- 
ded in OCT, and sectioned at 6 µm for immunofluorescence. The other half of the 
materials were paraffin embedded and sectioned at 3 µm for histology and immuno- 
histochemistry. Antibodies: Ki67 (Neomarkers), cleaved caspase 3 (Cell Signaling), 
GFP (Novus), chromogranin A (Neomarkers), β-gal (Cappel). 

In situ hybridization and X-GAL staining. Full-length Lgr5 cDNA was cloned 
into the pGEM vector to make an antisense DIG probe. Protocols for in vitro tran- 
scription and in situ hybridization were as described previously. Whole-mount 
X-GAL staining was performed as described. 

Crypt organoid culture. Crypt isolation and culture were performed as 
described. 
ATP-induced helicase slippage reveals highly coordinated subunits 


Bo Sun, Daniel S. Johnson, Gayatri Patel, Benjamin Y. Smith, Manjula Pandey, Smita S. Patel & Michelle D. Wang 


Helicases are vital enzymes that carry out strand separation of 
duplex nucleic acids during replication, repair and recombina- 
tion. Bacteriophage T7 gene product 4 is a model hexameric 
helicase that has been observed to use dTTP, but not ATP, to 
unwind double-stranded (ds)DNA as it translocates from 5’ to 3’ 
along single-stranded (ss)DNA. Whether and how different sub- 
units of the helicase coordinate their chemo-mechanical activities 
and DNA binding during translocation is still under debate. 
Here we address this question using a single-molecule approach 
to monitor helicase unwinding. We found that T7 helicase does in 
fact unwind dsDNA in the presence of ATP and that the unwinding 
rate is even faster than that with dTTP. However, unwinding traces 
showed a remarkable sawtooth pattern where processive unwind- 
ing was repeatedly interrupted by sudden slippage events, ulti- 
mately preventing unwinding over a substantial distance. This 
behaviour was not observed with dTTP alone and was greatly 
reduced when ATP solution was supplemented with a small 
amount of dTTP. These findings presented an opportunity to 
use nucleotide mixtures to investigate helicase subunit coordina- 
tion. We found that T7 helicase binds and hydrolyses ATP and 
dTTP by competitive kinetics such that the unwinding rate is dic- 
tated simply by their respective maximum rates V_max, Michaelis 
constants K_M and concentrations. In contrast, processivity does 
not follow a simple competitive behaviour and shows a cooperative 
dependence on nucleotide concentrations. This does not agree 
with an uncoordinated mechanism where each subunit functions 
independently, but supports a model where nearly all subunits 
coordinate their chemo-mechanical activities and DNA binding. 
Our data indicate that only one subunit at a time can accept a 
nucleotide while other subunits are nucleotide-ligated and thus 
they interact with the DNA to ensure processivity. Such subunit 
coordination may be general to many ring-shaped helicases and 
reveals a potential mechanism for regulation of DNA unwinding 
during replication. 

Despite the fact that most motor proteins use ATP as a fuel source, 
previous bulk studies have shown that T7 helicase does not unwind 
DNA efficiently in the presence of ATP, although it is capable of ATP 
hydrolysis. To investigate why ATP seemed not to support T7 
helicase unwinding, we used a single-molecule optical trapping assay 
that we previously developed to measure unwinding of dsDNA or 
translocation on ssDNA (Fig. 1a and Supplementary Fig. 1). Briefly, 
two strands of a DNA fork junction were held under tension that was 
not sufficient to mechanically unwind the junction without a helicase. 
Helicase unwinding of the junction resulted in an increase in the 
ssDNA length, permitting tracking of the helicase location. When 
experiments were conducted with 2 mM ATP, we were surprised to 
find that ATP not only supported dsDNA unwinding but did so at a 
significantly faster rate than dTTP (Fig. 1b, c). 
However, processive unwinding was interrupted by slippage events, 
resulting in a remarkable sawtooth pattern in the unwinding trace 


(Fig. 1b). Control experiments verified that each trace was the action 
of a single helicase (Supplementary Fig. 2). We attribute this pattern to 
helicase losing its grip on the ssDNA, sliding backwards under the 
influence of the reannealing DNA fork, and then regaining its grip 
and resuming unwinding (Fig. 1d). In contrast, slippage behaviour was 
essentially absent with 2mM dTTP alone (Fig. 1b). These results 
resolve the mystery of the apparent lack of significant unwinding 
activity seen in bulk studies; unwinding and slippage could not 
be separated, so unwinding was masked by unobservable slips that 
prevented helicase from moving over a substantial distance. Our work 
is the first direct observation, to our knowledge, of helicase nucleotide- 
specific slippage. Previous studies of non-ring-shaped helicases have 
reported reverse motions of the unwinding fork attributable to helicase 
reaching the end of the DNA or encountering a barrier, dissociating 
from the DNA, or moving in the reverse direction. These are of 
a somewhat different nature than what we have observed. The only 
slippage behaviour that may resemble ours is from non-helicase bac- 
teriophage motors, but their slippage is not a result of the use of a 
specific nucleotide. 

Slippage was not observed with dTTP alone (Fig. 1b) and therefore 
seems to be sensitive either to the base composition of the bound nuc- 
leotide (for example, adenosine versus thymidine) or the type of sugar 
(ribose versus deoxyribose). We compared slippage for all four NTPs and 
their dNTP counterparts (Supplementary Fig. 3). For each nucleotide we 
measured processivity, defined as the mean distance between slips 
(Supplementary Fig. 4). The results indicate that the additional 
2'-OH group on the ribose sugar makes the helicase more prone to 
slipping. Examination of the helicase structure at the nucleotide-binding 
pocket reveals that the 2’-OH group of a bound nucleotide may dis- 
place the -OH group on the side chain of residue Y535 (Supplementary 
Fig. 5a). We thus generated a Y535F mutant to remove the -OH group 
and it showed significantly increased processivity in the presence of ATP, 
albeit still less than that seen for dATP (Supplementary Fig. 5b). 
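
As a rough illustration of how processivity, defined above as the mean distance between slips, might be extracted from an unwinding trace, the following Python sketch treats any sudden backward jump larger than a threshold as a slip. The function names, the 50-bp threshold and the toy trace are assumptions made for illustration; they are not the analysis procedure used in this work.

```python
def distances_between_slips(trace_bp, slip_drop_bp=50):
    """Return the distance unwound (bp) in each processive run that ends in a slip.

    trace_bp: number of unwound base pairs at successive time points.
    slip_drop_bp: hypothetical threshold; a backward jump of at least this
    size between consecutive points is counted as a slip.
    """
    runs = []
    start = trace_bp[0]
    for prev, cur in zip(trace_bp, trace_bp[1:]):
        if prev - cur >= slip_drop_bp:   # sudden backward jump = slip
            runs.append(prev - start)    # distance unwound before the slip
            start = cur                  # unwinding resumes from here
    return runs


def processivity(trace_bp, slip_drop_bp=50):
    """Mean distance between slips, or None if no slip was observed."""
    runs = distances_between_slips(trace_bp, slip_drop_bp)
    return sum(runs) / len(runs) if runs else None


if __name__ == "__main__":
    # Toy sawtooth trace (made-up numbers): two runs, each ended by a slip.
    toy_trace = [0, 100, 200, 300, 20, 120, 220, 10, 110]
    print(distances_between_slips(toy_trace))
    print(processivity(toy_trace))
```

A run still in progress when the trace ends is not counted, which is one possible convention; a real analysis would also need to handle noise in the trace.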

Although ATP caused helicase to slip more frequently, it supported a 
much faster unwinding rate between slips, consistent with an earlier 
finding of a faster rate of ATP hydrolysis. Because ATP and dTTP 
support different unwinding rates and processivities, we used nucleo- 
tide mixtures to understand how multiple subunits of the helicase 
coordinate unwinding activity. We approximated the in vivo concen- 
trations of ATP and dTTP of Escherichia coli by using 2.0 mM ATP 
and a small amount of dTTP, 0.2 mM (Fig. 1b, c). Although the unwind- 
ing rate between slips was close to the value observed with 2 mM ATP 
alone, the processivity increased by approximately threefold. When the 
converse experiment was performed (0.2 mM ATP and 2.0 mM dTTP), 
the unwinding rate was comparable to that with 2 mM dTTP alone and 
minimal slippage was observed (Fig. 1b, c). These results imply that 
even a small fraction of helicase subunits, when bound with dTTP, 
reduce slippage and substantially increase processivity. This finding 
was further substantiated by bulk experiments using ATP alone, and 
an ATP/dTTP mixture (Supplementary Fig. 6). To determine if T7 
Figure 1 | Comparison of helicase unwinding behaviours with different 
nucleotides. a, Schematic of the single-molecule configuration (not to scale). 
The single-stranded ends of a dsDNA were held at a constant unzipping force of 
8 pN while T7 helicase unwound the dsDNA by translocating on ssDNA. 
b, Representative traces showing the number of unwound base pairs versus 
time in the presence of various concentrations of nucleotides. For clarity, traces 
have been arbitrarily shifted along both axes. c, A summary of unwinding rates 
and processivities. Uncertainties are s.e.m. d, Cartoon illustrating slippage 
behaviour. The helicase unwinds, loses grip, slips, re-grips and resumes 
unwinding. Dotted helicase indicates a previous location of the helicase. 


helicase binds DNA with different affinities in the presence of dTTP and 
ATP, bulk binding studies were carried out using fluorescence aniso- 
tropy with dTTP and ATP analogues (Supplementary Fig. 7). The 
results show that T7 helicase binds ssDNA 100-fold more tightly with 
dTMPPCP than with AMPPCP, and indicate that the greater slippage 
in the presence of ATP is probably due to weaker binding to DNA. 
The discovery of helicase slippage and the ability to directly measure 
helicase processivity provided a unique opportunity to explore the fol- 
lowing: (1) how ATP and dTTP compete for binding to helicase sub- 
units; (2) how nucleotide binding regulates helicase affinity to DNA; 
and (3) how multiple subunits of helicase coordinate their activities. 
To understand how ATP and dTTP compete for binding to helicase 
subunits, we determined the unwinding rates between slippage events 
(Fig. 2a) as a function of nucleotide concentration. For each nucleotide 
alone, the unwinding rate followed Michaelis-Menten-like kinetics, 
yielding V_max and K_M values that were both higher for ATP than for 


2 | NATURE | VOL 000 | 00 MONTH 2011 


[Figure 2 panels. a, "Unwinding between slips": unwound base pairs (scale bar, 500 bp) versus time (s). b, Unwinding rate versus [dTTP] (mM) and versus [ATP] (mM); dTTP: V_max = 153 ± 4 bp s^-1, K_M = 0.95 ± 0.07 mM; ATP: V_max = 442 ± 22 bp s^-1, K_M = 1.81 ± 0.24 mM. c, Unwinding rate versus [ATP] (mM) at [dTTP] = 2 mM and 0.5 mM, and versus [dTTP] (mM) at [ATP] = 4 mM and 1.5 mM. d, Kinetic scheme: E binds ATP to form E-ATP and releases Pi and ADP; E binds dTTP to form E-dTTP and releases Pi and dTDP.] 

Figure 2 | Helicase unwinding kinetics. a, Example of unwinding with ATP to 
illustrate the method of determining unwinding rate by analysing data between 
slips. b, Kinetic constants for unwinding under a constant unzipping tension of 
8 pN in the presence of either ATP (right) or dTTP (left). For each nucleotide, K_M 
and V_max were obtained by fitting the unwinding rates as a function of NTP 
concentration to the Michaelis-Menten equation. c, Measured unwinding rates at 
either fixed [dTTP] and varying [ATP], or fixed [ATP] and varying [dTTP], and 
comparison with direct predictions (not fits) from the competitive nucleotide 
binding model using kinetic constants K_M and V_max shown in b. Error bars 
indicate s.e.m. d, Kinetic pathway of a competitive binding model where ATP and 
dTTP compete for binding and hydrolysis by the helicase (denoted by E here). 


dTTP (Fig. 2b). These kinetics indicated that there was no cooperativity 
in NTP binding and hydrolysis. Next, we conducted experiments in 
which the concentration of one nucleotide was fixed while that of the 
other nucleotide was varied. The resulting unwinding rates could be 
explained by competitive kinetics: ATP and dTTP compete for binding 
based on their respective affinities and the resulting reaction rate is 
determined by their concentrations, V_max and K_M (Fig. 2c, d; Methods 
Summary and Supplementary Discussion). A comparison of unwind- 
ing rates with mixed nucleotides and direct predictions (not fits) from 
the competitive binding kinetics showed excellent agreement. These 
results were further substantiated by ssDNA translocation rate experi- 
ments (Supplementary Fig. 8). This also explains why in Fig. 1b, c the 
unwinding rate was minimally altered when 0.2mM of dTTP was 
added to 2mM ATP. Under those conditions, only about 16% of the 
nucleotide bound to the helicase hexamer was dTTP. 
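
The competitive picture described above can be sketched numerically. The Python snippet below uses the textbook rate law for two substrates competing for the same site, with the V_max and K_M values as read from Fig. 2b; the exact expression used by the authors is given in their Methods Summary and Supplementary Discussion and may differ in detail. Function names are hypothetical, and treating K_M as a measure of binding affinity is an assumption of this sketch.

```python
# Kinetic constants as read from Fig. 2b (bp/s and mM); treat as approximate.
VMAX_ATP, KM_ATP = 442.0, 1.81
VMAX_DTTP, KM_DTTP = 153.0, 0.95


def competitive_rate(atp_mM, dttp_mM):
    """Unwinding rate (bp/s) when ATP and dTTP compete for the same site."""
    a = atp_mM / KM_ATP      # ATP concentration scaled by its K_M
    t = dttp_mM / KM_DTTP    # dTTP concentration scaled by its K_M
    return (VMAX_ATP * a + VMAX_DTTP * t) / (1.0 + a + t)


def fraction_dttp_bound(atp_mM, dttp_mM):
    """Fraction of nucleotide-bound sites holding dTTP (needs some nucleotide)."""
    a = atp_mM / KM_ATP
    t = dttp_mM / KM_DTTP
    return t / (a + t)


if __name__ == "__main__":
    print(fraction_dttp_bound(2.0, 0.2))   # dTTP share of bound nucleotide
    print(competitive_rate(2.0, 0.0))      # 2 mM ATP alone
    print(competitive_rate(2.0, 0.2))      # 2 mM ATP + 0.2 mM dTTP
```

With 2 mM ATP and 0.2 mM dTTP this gives a dTTP share of about 16%, matching the figure quoted in the text, and a predicted rate that falls only from roughly 232 to 225 bp/s relative to 2 mM ATP alone.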

The competitive binding kinetics for nucleotides, however, does not 
explain the observed slippage behaviour with mixed nucleotides 
(Fig. 1b, c). That is, it is unclear how the 16% bound dTTP resulted 
in a threefold increase in processivity. If only a single nucleotide can be 
bound by the helicase at a time and the type of the bound nucleotide 
determines the helicase’s affinity to the DNA, then processivity should 
only increase by 7% (Supplementary Discussion). In addition, it has 
previously been shown that the helicase subunits do not bind to 
ssDNA in the absence of a nucleotide. However, we found minimal 
slippage even at [dTTP] much below its K_M. These observations indi- 
cate participation of multiple subunits in both nucleotide and DNA 
binding, where each subunit would have a nucleotide-specific DNA 
binding affinity. Our data indicate that helicase may not slip if at least 
one subunit of the hexamer is in a deoxythymidine-ligated state, which 
has a higher affinity for the DNA. 
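
To see why the participation of several subunits could amplify the effect of a small dTTP fraction, consider a deliberately simplified toy calculation. It assumes that each of n participating subunits independently holds dTTP with probability f (the dTTP share of bound nucleotide) and asks how often at least one subunit is dTTP-ligated. This independence assumption and the function name are illustrative only; the quantitative coordinated model is the one described in the Supplementary Discussion, not this sketch.

```python
def prob_at_least_one_dttp(fraction_dttp, n_subunits):
    """Probability that at least one of n nucleotide-ligated subunits holds dTTP.

    Toy assumption: each subunit's nucleotide identity is independent and is
    dTTP with probability `fraction_dttp`.
    """
    return 1.0 - (1.0 - fraction_dttp) ** n_subunits


if __name__ == "__main__":
    f = 0.16  # dTTP share of bound nucleotide quoted in the text
    for n in (1, 2, 5):
        print(n, prob_at_least_one_dttp(f, n))
```

Under these assumptions a single participating subunit is dTTP-ligated only 16% of the time, whereas with five participating subunits at least one is dTTP-ligated more than half of the time, which conveys qualitatively why processivity can respond steeply to small amounts of dTTP.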

Two models may be consistent with this idea. In an uncoordinated 
model, each helicase subunit functions independently in its nucleo- 
tide binding/hydrolysis, and DNA binding/release (Supplementary 
Discussion). Conversely, coordinated models have been proposed 
for T7 helicase, but details of the coordination remain unclear. 
Biochemical and structural studies indicate that nucleotide hydrolysis 
may occur sequentially around the hexameric ring, that roughly 
four subunits are nucleotide-ligated at any given time, and that DNA 
binding to the helicase might involve one or two helicase subunits. 
A model based on structural studies has been proposed for ring-shaped 
helicases E1 (ref. 23) and Rho, where all or some of the subunits 
coordinate their chemo-mechanical activities (Fig. 3d). Coordination 
Figure 3 | Processivity dependence on nucleotides and a proposed 
coordinated model. a, An example of unwinding with ATP to illustrate the 
method of determining distance between slips. b, c, Measured processivity 
(mean distance between slipping events) as a function of [ATP] alone, and as 
functions of [dTTP] at two fixed concentrations of ATP. Note processivity 
increased substantially when a small amount of dTTP was added to the 
reaction. Solid lines are global fits using the coordinated model, yielding 
n = 5.2 ± 0.4. For comparison, fits using n = 2 are also shown. Error bars 
indicate s.e.m. d, An interpretation of the proposed coordinated model. Each 
subunit is uniquely labelled with a different colour and has a potential 
ssDNA-binding site (small red dot). Nucleotide binding and subsequent 
hydrolysis occur sequentially around the ring. If a subunit is nucleotide-ligated 
(the state of hydrolysis indicated by Ni), it has a non-zero probability of being 
bound to ssDNA. During unwinding, the leading subunit can bind to a 
nucleotide (N) and thus acquire affinity to the upstream ssDNA. This 
stimulates the last nucleotide-bound subunit to release its nucleotide and 
ssDNA. Then the cycle proceeds again around the ring. Slippage occurs when 
all subunits simultaneously release ssDNA, as determined by the joint 
probability of detachment for all subunits (Supplementary Discussion). 
could occur sequentially around the hexameric ring with the leading 
subunit poised for NTP binding and each successive subunit having a 
bound nucleotide in states of progression along the chemical reaction 
pathway (NTP, NDP + Pi, NDP, and so on). Depending on the state 
and type of nucleotide bound each subunit may have a different affinity 
to DNA. Once the leading subunit binds to an NTP and reels in the 
DNA, the remaining subunits progress to their next reaction states. 
Product release by the last participating subunit results in release of 
DNA from that subunit, and thus completes a single cycle. 

We formulated quantitative descriptions for the uncoordinated and 
coordinated models (Supplementary Discussion). The observed rate of 
unwinding as a function of [ATP] or [dTTP] is consistent with both 
models, which predict an apparent Michaelis-Menten-like kinetics. 
The observed unwinding rate with ATP and dTTP mixtures is also 
consistent with the competitive binding kinetics for both models as 
long as, in the case of the coordinated model, the rates are treated as 
averages over time (Supplementary Discussion). Although the two 
models cannot be distinguished based on rate measurement studies, 
they do yield different predictions for DNA slippage behaviour. The 
uncoordinated model (Supplementary Discussion) requires that each 
subunit binds and hydrolyses nucleotides independently with an affinity 
to DNA dependent on the state and type of nucleotide bound. This 
model is not consistent with the processivity data taken with mixed 
nucleotides at concentrations near or lower than their respective $K_M$ 
values (Supplementary Fig. 9). 

On the other hand, the coordinated model requires that subunits 
participating in coordination bind and hydrolyse nucleotide in coordi- 
nation, with only one subunit poised to bind a nucleotide at a time and 
with each subunit having an affinity to DNA dependent on the state and 
type of nucleotide bound. This model predicts that processivity should 
increase linearly with [NTP] in the presence of a single type of NTP. 
Indeed, our data show that the processivity increases linearly with 
increasing [ATP] (Fig. 3a, b). If multiple helicase subunits coordinate 
in their chemo-mechanical activities, what is the degree of coordination 
as measured by the number of participating subunits at any given time 
(n)? This is a key parameter that characterizes the mechanism of the 
helicase. Previous studies indicate that only one or two subunits are 
involved in significant DNA binding, suggesting a lower degree of 
coordination of n = 1 or 2 (refs. 16, 20-22). However, subunits may 
participate in the coordination even if they have lower affinity to ssDNA. 
The coordinated model formulated (Supplementary Discussion) is 
rather general and naturally takes this into account. Interestingly, it 
predicts that processivity sensitively depends on n as [dTTP] is 
increased in the presence of a fixed [ATP]—the larger n, the more 
subunits participate in DNA binding, and the more steeply processivity 
increases with [dTTP]. Therefore we measured processivity with mix- 
tures of ATP and dTTP (Fig. 3c). A global fit to the processivity data in 
Fig. 3b, c yielded n = 5.2 ± 0.4 (Methods Summary). In contrast, n = 2 
does not agree with the measurements. These findings are further 
substantiated by experiments using UTP instead of ATP (Supplementary 
Fig. 10, n = 5.0 ± 0.3), experiments under a different unzipping force 
(Supplementary Fig. 11, n = 5.4 ± 0.3), and data on time between slips 
(Supplementary Fig. 12, n = 5.5 ± 0.4). Because n = 6 is expected for a 
hexamer, this finding indicates that nearly all subunits participate in the 
coordination (n = 5 or 6) (Fig. 3d). Our findings suggest that only one 
subunit at a time can accept an incoming nucleotide, while the rest of the 
subunits are already nucleotide bound and coordinate to prevent slip- 
page and maintain high processivity. 

The work presented here provides a quantitative description of 
nucleotide binding/hydrolysis and its coupling to DNA binding and 
translocation for T7 helicase. This was possible because unwinding 
and slippage events are clearly distinguishable in single-molecule 
traces. The slippage behaviour is explained by a multiple-site coordi- 
nated model. For helicase to slip, all six subunits must simultaneously 
lose their grip on the DNA. This happens more often when helicase 
subunits are bound only to ribose nucleotides. Our data demonstrate 
that T7 helicase has a very weak DNA binding affinity in the presence 
of ATP but the addition of a small amount of dTTP to the ATP 
reaction increases the binding affinity of helicase to DNA. As a con- 
sequence, the presence of a single deoxythymidine-ligated subunit 
significantly decreases the chance of slippage so that helicase can still 
effectively unwind dsDNA with ATP. Thus T7 helicase, like most other 
helicases, could still use ATP as a main power source in vivo, under 
conditions such as those during phage infection of E. coli where ATP 
is most abundant. ATP could be used for rapid unwinding and dTTP 
for high processivity. Although we focus here on a comparison of 
dTTP with ATP for helicase unwinding, other deoxyribose nucleotides 
may also reduce the frequency of slippage (Supplementary Fig. 3). We 
speculate that slippage may also provide an evolutionary advantage for 
replication: when dNTP concentrations are low, slippage can slow 
down helicase to allow its synchronization with a slow-moving DNA 
polymerase. 


METHODS SUMMARY 


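Single-molecule assays were performed as described previously. If dTTP and ATP 
compete for binding to helicase according to the kinetic pathway outlined in Fig. 2d, 
then the resulting unwinding rate is: 

$$v_{\mathrm{tot}} = \frac{v_{\max}^{\mathrm{ATP}}\,\dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + v_{\max}^{\mathrm{dTTP}}\,\dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}{1 + \dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + \dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}$$

where for each type of nucleotide $K_M$ and $v_{\max}$ are set by the rate constants 
of that pathway, with $v_{\max}$ proportional to the step size $s$ (in nucleotides) 
(see Supplementary Discussion). In the presence of dTTP and ATP, if n helicase 
subunits coordinate in their chemo-mechanical activities and DNA binding, then the 
resulting distance between slips (processivity) is: 

$$d_{\mathrm{processivity}} = c\left(v_{\max}^{\mathrm{ATP}}\,\frac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + v_{\max}^{\mathrm{dTTP}}\,\frac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}\right) \Bigg/ \left(\frac{[\mathrm{ATP}]/K_M^{\mathrm{ATP}}}{[\mathrm{ATP}]/K_M^{\mathrm{ATP}} + [\mathrm{dTTP}]/K_M^{\mathrm{dTTP}}}\right)^{n-1}$$

(Supplementary Discussion), with c being a proportionality constant. This 
expression was used to fit data in Fig. 3b, c with c and n as fit parameters. 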


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 

Protein and DNA preparations. Wild-type T7 helicase (gp4A′) and Y535F 4A′ 
were expressed and purified as described previously. A 5.2 kb DNA was 
constructed as described elsewhere, with minor modifications. Briefly, a ~1.1 kb 
anchoring segment was prepared by PCR from pRL574 using a digoxigenin-labelled 
primer, and then digested with BstXI (NEB) to produce a 3 bp overhang. 
A ~4.1 kb unzipping/translocation/unwinding segment was derived from 
pCP681 by digestion with EarI (NEB) and ligated to a biotin-labelled 37 bp 
segment lacking a 5′ phosphate on the distal end. The anchoring segment and 
unzipping segment were then ligated, with a nick due to the missing phosphate. 
For ssDNA translocation experiments (Supplementary Fig. 8), the ~4.1 kb 
segment was capped with a hairpin 
(5′-TAGGGCGACCTAGCTCTATGCTAGGTCGCC-3′). 

Single-molecule assays. Sample preparation was similar to that previously 
described. Briefly, helicase was prepared by first incubating 2 µM of the helicase 
monomer for 20 min in the unwinding buffer. This solution was then further 
diluted to obtain the final experimental concentrations of helicase monomer, 
nucleotides and MgCl2. DNA tethers were formed by first non-specifically coating the 
sample chamber surface with anti-digoxigenin (Roche), followed by an incubation 
with digoxigenin-tagged DNA. Streptavidin-coated 0.48 µm polystyrene 
microspheres were then added to the chamber. Finally, helicase solution was flowed in 
just before data acquisition. The helicase unwinding buffer was 20 mM Tris-HCl 
(pH 7.5), 3 mM EDTA, 0.02% Tween 20, 50 mM NaCl, NTPs or dNTPs at the 
concentrations specified in the text, and MgCl2 at a concentration 5 mM in excess of 
the total nucleotide concentration (Supplementary Fig. 13). The helicase monomer 
concentration was adjusted between 1 and 500 nM for each buffer condition so that the 
average unwinding initiation time (defined as the time between when the DNA was 
initially mechanically unzipped and when the helicase began to unwind) was 
approximately the same for all experiments (Supplementary Fig. 3). 

Experiments were conducted in a climate-controlled room at a temperature of 
23.3 °C, but owing to local laser trap heating the temperature increased slightly to 
25 ± 1 °C (ref. 26). Each experiment was conducted in the following steps 
(Supplementary Fig. 1). First, several hundred base pairs of dsDNA were 
mechanically unzipped, at a constant velocity of 1,400 bp s⁻¹, to produce a ssDNA 
loading region for helicase. Second, after the force dropped owing to helicase 
loading and initiation of unwinding, several hundred more base pairs were 
mechanically unzipped to generate ssDNA for helicase translocation. Third, the fork 
position was maintained until the force dropped again, indicating that the helicase 
had again reached the junction, at which point the force was allowed to drop to 
8 pN and then maintained at this level as helicase unwound the remaining ~3 kb 
of dsDNA. Measurements of ssDNA translocation rates and dsDNA unwinding 
rates by T7 helicase were thus obtained for each tether. 

Data collection and analysis. Data were low-pass filtered to 5 kHz and digitized at 
12 kHz, then further averaged to 110 Hz. The acquired data signals were 
converted into unwound base pairs as previously described. To improve 
positional accuracy and precision, the data were then aligned to a theoretical unzipping 
curve for the mechanically unzipped section of the DNA. Slippage events were 
identified by a threshold on the instantaneous unwinding rate at each sequence 
position (Supplementary Fig. 4). We used a threshold of 2,000 bp s⁻¹ in the reverse 
velocity for identifying slippage. Unwinding rates from each trace were found from 
linear fits to the unwinding between adjacent slippage events. An average 
unwinding rate was obtained from a number of traces. Distances travelled between slips 
were compiled to determine processivity. These distances followed an exponential 
distribution, indicating a stochastic process in slippage. Processivity is defined as 
the mean distance of the distribution (Supplementary Fig. 4b). 
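
As a rough illustration of the analysis described above, the following Python sketch flags slippage events with a threshold on the reverse velocity and averages the distances travelled between slips. The function names, the finite-difference velocity estimate and the merging of consecutive flagged samples into one event are assumptions made for illustration; they are not the analysis code used in the study.

```python
def find_slips(times, positions, reverse_threshold=2000.0):
    """Return (start, end) index pairs of slippage events.

    A sample-to-sample step counts as slippage when the fork moves
    backwards faster than reverse_threshold (bp/s). Consecutive flagged
    steps are merged into a single event (an assumption of this sketch).
    """
    events = []
    for i in range(1, len(positions)):
        velocity = (positions[i] - positions[i - 1]) / (times[i] - times[i - 1])
        if velocity < -reverse_threshold:
            if events and events[-1][1] == i - 1:
                events[-1] = (events[-1][0], i)  # extend the ongoing slip
            else:
                events.append((i - 1, i))
    return events


def distances_between_slips(positions, events):
    """Net distance unwound (bp) from the end of one slip to the start of the next."""
    return [
        positions[next_start] - positions[prev_end]
        for (_, prev_end), (next_start, _) in zip(events, events[1:])
    ]


def mean_processivity(distances):
    """Processivity taken as the mean of the distances between slips."""
    return sum(distances) / len(distances)
```

With real traces, the mean would be taken over distances pooled from many tethers, and an exponential fit could be used to check the distribution before averaging.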

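Modeling. If dTTP and ATP compete for binding to helicase according to the 
kinetic pathway outlined in Fig. 2d, then the resulting unwinding rate is: 

$$v_{\mathrm{tot}} = \frac{v_{\max}^{\mathrm{ATP}}\,\dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + v_{\max}^{\mathrm{dTTP}}\,\dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}{1 + \dfrac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + \dfrac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}}$$

where for each type of nucleotide $K_M$ and $v_{\max}$ are set by the rate constants 
of that pathway, with $v_{\max}$ proportional to the step size $s$ (in nucleotides) 
(see Supplementary Discussion). In the presence of dTTP and 
ATP, if n helicase subunits coordinate in their chemo-mechanical activities and 
DNA binding, then the resulting distance between slips (processivity) is: 

$$d_{\mathrm{processivity}} = c\left(v_{\max}^{\mathrm{ATP}}\,\frac{[\mathrm{ATP}]}{K_M^{\mathrm{ATP}}} + v_{\max}^{\mathrm{dTTP}}\,\frac{[\mathrm{dTTP}]}{K_M^{\mathrm{dTTP}}}\right) \Bigg/ \left(\frac{[\mathrm{ATP}]/K_M^{\mathrm{ATP}}}{[\mathrm{ATP}]/K_M^{\mathrm{ATP}} + [\mathrm{dTTP}]/K_M^{\mathrm{dTTP}}}\right)^{n-1}$$

(Supplementary Discussion), with c being a proportionality constant. This 
expression was used to fit data in Fig. 3b, c with c and n as fit parameters. 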
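
The two model expressions in this section can be sketched in Python as follows. This is a minimal illustration, assuming the reading of the text in which the unwinding rate follows competitive Michaelis-Menten-type kinetics and the processivity is inversely proportional to the ATP-bound fraction raised to the power n − 1. The function and argument names are hypothetical, and any parameter values passed in are placeholders rather than fitted values from the study.

```python
def unwinding_rate(atp, dttp, vmax_atp, vmax_dttp, km_atp, km_dttp):
    """Unwinding rate for ATP and dTTP competing for the same binding site."""
    a = atp / km_atp      # [ATP] scaled by its K_M
    d = dttp / km_dttp    # [dTTP] scaled by its K_M
    return (vmax_atp * a + vmax_dttp * d) / (1.0 + a + d)


def processivity_model(atp, dttp, vmax_atp, vmax_dttp, km_atp, km_dttp, c, n):
    """Distance between slips for n coordinating subunits.

    Requires atp > 0: the expression divides by the fraction of
    nucleotide binding attributable to ATP, raised to the power (n - 1).
    """
    a = atp / km_atp
    d = dttp / km_dttp
    atp_fraction = a / (a + d)
    return c * (vmax_atp * a + vmax_dttp * d) / atp_fraction ** (n - 1)
```

With dTTP absent the ATP fraction is 1, so the processivity reduces to a term linear in [ATP], which matches the linear dependence reported for a single nucleotide type. With dTTP present, a larger n makes the processivity rise more steeply.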


25. Koch, S. J., Shundrovsky, A., Jantzen, B. C. & Wang, M. D. Probing protein-DNA 
interactions by unzipping a single DNA double helix. Biophys. J. 83, 1098-1105 
(2002). 

26. Peterman, E. J., Gittes, F. & Schmidt, C. F. Laser-induced heating in optical traps. 
Biophys. J. 84, 1308-1316 (2003). 

27. Shundrovsky, A., Smith, C.L., Lis, J. T., Peterson, C. L. & Wang, M. D. Probing SWI/ 
SNF remodeling of the nucleosome by unzipping single DNA molecules. Nature 
Struct. Mol. Biol. 13, 549-554 (2006). 

28. Lohman, T. M., Tomko, E. J. & Wu, C. G. Non-hexameric DNA helicases and 
translocases: mechanisms and regulation. Nature Rev. Mol. Cell Biol. 9, 391-401 
(2008). 


©2011 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


doi:10.1038/nature10423 


Deep sequencing reveals 50 novel genes 
for recessive cognitive disorders 
Common diseases are often complex because they are genetically heterogeneous, with many different genetic defects 
giving rise to clinically indistinguishable phenotypes. This has been amply documented for early-onset cognitive 
impairment, or intellectual disability, one of the most complex disorders known and a very important health care 
problem worldwide. More than 90 different gene defects have been identified for X-chromosome-linked intellectual 
disability alone, but research into the more frequent autosomal forms of intellectual disability is still in its infancy. To 
expedite the molecular elucidation of autosomal-recessive intellectual disability, we have now performed homozygosity 
mapping, exon enrichment and next-generation sequencing in 136 consanguineous families with autosomal-recessive 
intellectual disability from Iran and elsewhere. This study, the largest published so far, has revealed additional mutations 
in 23 genes previously implicated in intellectual disability or related neurological disorders, as well as single, probably 
disease-causing variants in 50 novel candidate genes. Proteins encoded by several of these genes interact directly with 
products of known intellectual disability genes, and many are involved in fundamental cellular processes such as 
transcription and translation, cell-cycle control, energy metabolism and fatty-acid synthesis, which seem to be 
pivotal for normal brain development and function. 


Early-onset cognitive impairment, or intellectual disability, is an 
unresolved health care problem and an enormous socio-economic 
burden. Most severe forms of intellectual disability are due to 
chromosomal abnormalities or defects in specific genes. For many 
years, research into the genetic causes of intellectual disability and 
related disorders has focused on X-chromosome-linked intellectual 
disability (XLID). It has become clear, however, that X-linked forms 
account for only 10% of intellectual disability cases, which means that 
the vast majority of the underlying genetic defects must be autosomal. 
For severe forms of intellectual disability, autosomal-dominant 
inheritance is rare because most affected individuals do not reproduce, 
but recent observations suggest that in outbred Caucasian populations, 
a significant portion of the sporadic cases may be due to 
dominant de novo mutations. So far, relatively little is known about 
the role of autosomal recessive intellectual disability (ARID), because 
in Western societies, where most of the research takes place, its investi- 
gation has been hampered by infrequent parental consanguinity and 
small family sizes. 

In most Northern African countries, and also in the Near and 
Middle East, parental consanguinity and large families are common; 
for example, in Iran, 40% of the families are consanguineous and 
about two-thirds of the population is 30 years of age or younger. 


Since 2004, we have performed systematic array-based consanguinity 
mapping in 272 consanguineous Iranian families. In several dozen 
families, we have defined single linkage intervals and mapped the 
underlying gene defects, and by subsequent mutation screening of 
candidate genes from these intervals, we and others identified several 
novel ARID genes (for review see refs 1, 7). 

Recently, exome enrichment and next-generation sequencing have 
been introduced as a cost-effective and fast strategy for comprehensive 
mutation screening and disease-gene identification in the coding portion 
of the human genome. To unravel the molecular basis of ARID 
in a systematic fashion, we have now used a related, but more targeted, 
approach. Instead of sequencing entire exomes in consanguineous 
families, we have focused on the exons from homozygous linkage 
intervals known to carry the genetic defect. Before sequencing, these 
exons were enriched by hybrid capture using custom-made oligonu- 
cleotide arrays as baits. All patients had cognitive impairment (mostly 
moderate or severe, see Supplementary Table 1), and in a subset of the 
families there were signs of autism spectrum disorder. More informa- 
tion about the families and their clinical features, quality controls 
performed to validate the sequence variants observed and to assess 
their pathogenicity, as well as other methodological details are pro- 
vided in Supplementary Information. 
Mutations in known and novel intellectual disability genes 


In 115 out of 136 families studied, plausible causal defects were 
observed, and in 78 of these, a single, apparently disease-causing muta- 
tion could be identified (see Supplementary Fig. 1, Tables 1 and 2 and 
Supplementary Table 2). Twenty-eight protein-truncating changes 
were found, including frameshift, splice-site and nonsense mutations, 
as well as whole-exon deletions, plus several smaller in-frame deletions 
of varying size. In 26 families listed in Table 1, we identified known, 
mostly syndromic forms of ARID, including rare metabolic defects and 
storage disorders, such as an atypical form of Tay-Sachs’ disease and 
Sanfilippo’s syndrome (mucopolysaccharidosis IIIB), as well as 
intellectual disability with congenital abnormalities, such as a Joubert-like 
syndrome resulting from AHI1 mutations, observed in two 
unrelated families. Two families were also found with allelic PRKCG 
mutations, implicated previously in spinocerebellar ataxia, and two 
families carried different allelic mutations in the SRD5A3 gene, 
associated with Kahrizi’s syndrome, a recently elucidated congenital 
glycosylation disorder. 

Two mutations involving the adaptor protein complex 4 were 
observed, namely in the AP4M1 and AP4E1 genes, which encode 
different AP-4 subunits. AP-4 is involved in the recognition and 
sorting of cargo protein transported from the trans-Golgi network 
to the endosomal-lysosomal system. Another possibly pathogenic 
change was found in the AP4B1 gene, but its effect may be obscured 
by a PEX6 mutation in the same family, which causes a severe peroxisome 
biosynthesis disorder and probably accounts for most of the 
clinical features. In highly inbred families, coexistence of two different 
recessive defects is not unexpected and is the most plausible explana- 
tion for the complex phenotypes in at least two families with novel 
forms of ARID (M154 and M189, see Table 2). 

Mutations in the SLC2A1 gene, which encodes a glucose trans- 
porter, the PRKRA gene with a role in dysautonomia, and the 
MED13L gene, previously associated with intellectual disability and 
cardiac symptoms, were the only plausible causes of intellectual dis- 
ability in three families with non-syndromic intellectual disability. 
None of the respective families showed signs of dysautonomia or 
cardiac abnormalities. In all other families, the phenotype was char- 
acteristic for the molecular defect, including family M198 with folate 
receptor deficiency, a rare syndromic form of ARID that can often be 


treated by oral administration of folinic acid'*. Further details are 
provided in Table 1. 

Apparently pathogenic changes were also found in 50 genes that 
had not been previously implicated in ARID (see Table 2). Thirty of 
the relevant families had non-syndromic forms of intellectual disabil- 
ity, whereas 22 exhibited syndromic forms. Only two of the novel 
ARID genes were mutated in more than a single family. Two different 
missense mutations with high pathogenicity scores were detected in 
ZNF526, which encodes a Krüppel-type zinc-finger protein. One of 
these changes was observed in DNA samples collected from two 
distinct families with non-syndromic intellectual disability, but closer 
inspection revealed that these families, which live in the same city in 
the northwestern part of Iran, share a common haplotype and thus 
must be distantly related. In these families, no other potentially 
disease-causing and co-segregating change could be identified. 
Zinc-finger proteins are transcriptional regulators, and other Krüppel-type 
zinc-finger genes have been implicated in intellectual disability 
before. Recent protein interaction studies have indicated a role for 
ZNF526 in promoting messenger RNA translation and cell growth 
(N. Hubner et al., personal communication). Another gene within 
which disease-causing mutations were found in two families was 
ELP2. It encodes a subunit of the RNA polymerase II elongator com- 
plex, which is a histone acetyltransferase component of RNA poly- 
merase II. This gene is involved in the acetylation of histones H3 and 
probably H4, and it may have a role in chromatin remodelling. 


Mutations affecting housekeeping genes 


In the LARP7 gene, we found a frameshift mutation in a family with 
intellectual disability and microcephaly. LARP7 is a negative tran- 
scriptional regulator of polymerase II genes, acting by means of the 
7SK RNP system. Within the 7SK RNP complex, the positive tran- 
scription elongation factor b (P-TEFb) is sequestered in an inactive 
form, preventing RNA polymerase II phosphorylation and sub- 
sequent transcriptional elongation. Hitherto, no disease association 
has been reported for LARP7. 

Presumably causative homozygous mutations were also found in 
KDM5A and KDM6B. These genes encode histone demethylases that 
specifically demethylate histone H3 at lysine 4 and lysine 27, respect- 
ively, and they both have a central role in the histone code. We have 


Table 1 | Mutations identified in known genes for intellectual disability or related disorders 


Family Gene Mutation LOD score Length (Mb) OMIM no. Diagnosis, clinical features 
8500306 AHI1 R329X 2.65 10.35 608629 Joubert’s syndrome 3 
M332 AHI1 R495H 3.2 1.1 608629 Joubert’s syndrome 3 
M254 AP4E1 V454fs 2.5 357 607244 Microcephaly, paraplegia 
M004 AP4M1 E193K 1.9 6.75 602296 Microcephaly, paraplegia 
M324 BBS7 533del2aa 3.24 8.2 209900 Bardet-Biedl’s syndrome 
M107 CA8 R237Q 2.4 4.02 613227 Ataxia, cerebellar hypoplasia 
M175 COL18A1 L1587fs 2. 9.8 267750 Knobloch’s syndrome (eye and brain development) 
G026 FAM126A Splice site* 2.4 5.46 610532 Hypomyelination-cataract 
M198 FOLR1 Splice site* 2, 6.95 136430 Folate receptor deficiency 
M165 HEXA C58Y 2.7 5.91 272800 Psychomotor delay, mild Tay-Sachs’ disease 
8600276† L2HGDH R335X 5. 3.39 609584 Hydroxyglutaric aciduria 
M142 MED13L R1416H 1.9 9.17 608808 Non-syndromic ID, no cardiac involvement 
8600486 NAGLU R565Q 2.8 3.25 252920 Sanfilippo’s syndrome, MPS IIIB 
8500234 PDHX R15H 3.13 35.17 245349 Pyruvate dehydrogenase defect 
331 PEX6 L534P 3.8 0.83 601498 Peroxisome biogenesis disorder 
8307998 PMM2 Y106F 2.67 6.71 212065 Glycosylation disorder CDG Ia 
8600273 PRKCG V177fs 2.53 0.72 605361 Spinocerebellar ataxia 14 
146 PRKCG D480Y 2A 745 605361 Spinocerebellar ataxia 14 
8600162 PRKRA S235T 2.1 40.02 612067 Non-syndromic ID 
8600042 SLC2A1 V237M 3.73 16.7 606777 Non-syndromic ID 
8700017 SRD5A3 Y169C 4.8 10.5 612713 Kahrizi’s syndrome, CDG 
069† SRD5A3 A68fs 3.01 10.44 612713 Kahrizi’s syndrome, CDG 
G008 SURF1 W227R 1.8 4.59 185620 Leigh’s syndrome, very mild form 
8600041 TH R202H 2.1 7.23 605407 Infantile parkinsonism, Segawa’s syndrome 
017 VRK1 R133C 3.4 3 607596 Pontocerebellar hypoplasia 
196 WDR62 G705G 2.1 18.33 600176 Microcephaly, cerebellar atrophy 
CDG, congenital disorder of glycosylation; fs, frameshift; ID, intellectual disability; LOD, logarithm of the odds; MPS, mucopolysaccharidosis; OMIM, Online Mendelian Inheritance in Man. 
* See Supplementary Information for further details. 
† Remotely related, degree of consanguinity is not clear, analysis performed under conservative assumption of second degree consanguinity. 
Table 2 | Apparently causative variants in novel (candidate) genes for intellectual disability 


Family Phenotype Gene Mutation LOD score Length (Mb) Supporting evidence 

008† S ACBD6 G22fs 2.65 6.46 P; binds long-chain acyl-CoA molecules, role in fatty acid synthesis or turnover. 

173 S,ASD ADK H324R 5.1 9.68 S, P; only change in family. Adenosine kinase, regulates adenosine levels in the brain. 
Overexpression leads to learning impairment in mice*®; knockout mice develop lethal 
neonatal liver steatosis®°. In human, a different gene defect has been found in this 
condition. 

266-2 S ADRA2B R440G 2.53 24.97 S, P; GPCR regulating adrenergic neurons in the CNS. Associates with EIF2B, a GEF 
regulating translation?®. Also associates with 14-3-3, which interacts with RGS7, mutated 
in family 8700136. 

226 S ASCC3 S1564P 3.2 62.80 S, P, E; helicase that is part of the activating signal co-integrator complex, enhances NF-κB 
and AP1. Interacts with RARS2, implicated in pontocerebellar hypoplasia 6. 

007Lt S ASCL1 A41S 24 18.13 Encodes the bHLH factor MASH1, critical role in neuronal commitment and 
differentiation*”4°, 

182 S C11orf46 R236H 2.1 12.39 P, E; encodes subunit of the Triple T complex, role in regulation of DNA damage response. 

G001 S C12orf57 M1V 2.5 11.19 S; function hitherto unknown. May overlap neighbouring ATN1 (DRPLA) gene (see UCSC 
Genome Browser, hg18; OMIM 125370). 

100 S C8orf41 P367L 3.3 6.44 S, P, E; C8orf41 associates with RUVBL2, which is involved in regulation of transcription 
and interacts with HDACs. 

G015 S C9orf86 A562P 3.3 2.17 P; encodes Rab-like GTP-binding protein PARF, which interacts with ARF (or CDKN2A). 
Other Rab has been implicated in ID. 

8500031 S CACNA1G S1346fs 2.4 18.76 P, E; encodes a low-voltage-activated calcium channel which may also modulate the firing 
patterns of neurons. 

8600057 S$ CAPN10 138ins5aa 2.1 2.09 E; calcium-regulated non-lysosomal endopeptidase with a role in cytoskeletal remodelling 
and signal transduction, involved in long-term potentiation®?. 

8600495 S CASP2 Q392X 2.5 29.62 P; caspase 2, role in apoptosis, abnormal in CASP2-deficient mice, particularly for motor 
and sympathetic neurons. Motor abnormalities not observed in family. 

346 S CCNA2 Splice site* 3.3 52.17 S, P; cyclin A2 is essential for cell cycle control. In mice, targeted deletion of this gene is 
lethal. Regulated by SCAPER, mutated in family 8600277. 

8500235t S$ CNKSR1 T282fs 2.53 15.83 P; regulates Raf in the MAPK pathway, acts as scaffold protein linking Ras and Rho signal 
transduction pathways”°. Interacts with RALGDS, which is mutated in family 8500155. 

144 S COQ5 G118S 1.8 15.10 P, E; methyltransferase with pivotal role in coenzyme Q biosynthesis. Interacts with NAB2 
which controls length of poly(A) tail (see http://thebiogrid.org/35094/summary/ 
saccharomyces-cerevisiae/coq5.html). The human orthologue of NAB2 is implicated in 
ARID*. 

178 S EEF1B2 Splice site* 2.6 13.84 S,P,E; controls translation by transferring aminoacyl-tRNAs to the ribosome. Interacts with 
UNC51-like kinase 2 which is involved in axonal elongation translation®°°. 

G017 S ELP2 T555P 2.4 14.33 P, E; encodes subunit of the RNA polymerase II elongator complex. ELP3 subunit 
implicated in motor neuron degeneration. Allelic ELP2 mutation found in family 
8500061. 

8500061 S ELP2 R462L 2.7 16.98 P, E; involved in transcriptional elongation, see also family G017 with allelic ELP2 mutation. 

M263 S ENTPD1 Y65C 2.65 12.12 P, E; ectonucleoside triphosphate diphosphohydrolase, expressed in CNS; knockout mice 
display abnormal synaptic transmitter release?’”. 

MO50+ S ERLIN2 R36K 3.73 12.72 S, P,E; involved in the ER-associated degradation of inositol 1,4,5-triphosphate 
receptors°®. 

8500058 S FASN R1819W 3.3 4.50 P; gene product synthesizes long-chain fatty acids from acetyl-CoA and malonyl-CoA. 
Expressed in post-synaptic density. In mice, FASN deficiency leads to embryonic 
lethality®?. 

M269 S FRY R1197X 2.8 12.68 P; regulates actin cytoskeleton, limits dendritic branching. In HeLa cells, FRY binds to 
microtubules and localizes on the spindle and is crucial for the alignment of mitotic 
chromosomes®°. 

M251 S GON4L Splice site* 3.01 40.19 P, E; cloned from brain. Encodes a transcription factor thought to function in cell cycle 
control. 

M189¢ S HIST1H4B K9fs 2.1 48.87 P, E; encodes a member of the histone H4 family; analogy to histone H3 mutation in family 
G002. Ehlers-Danlos-related symptoms are probably due to TNXB mutation. 

G002 NS HIST3H3 R130C 2.53 26.74 P; role in spindle assembly and chromosome bi-orientation. See also family M189 
with HIST1H4B mutation. 

8500064 NS INPP4A D915fs 2.4 46.16 P, E; encodes inositol polyphosphate-4-phosphatase, only plausible change in family. 
Regulates localization of synaptic NMDA receptors, protects neurons from excitotoxic cell 
death®°. Knockout mice develop locomotor instability; not observed in this family. 

M061 S KDM5A R719G 2.3 6.06 P, E; encodes histone demethylase specific for Lys 4 of histone H3, role in transcriptional 
regulation. Other histone demethylase has been implicated in X-linked ID. See also 
family 8303971 with KDM6B mutation. 


previously shown that mutations in another lysine-specific histone 
demethylase, KDM5C (also called JARID1C), are a relatively frequent 
cause of X-linked intellectual disability. In two other families, we 
observed apparently pathogenic mutations that involved histones 
directly: a frameshift mutation in the HIST1H4B gene which belongs 
to the histone 4 family, and a HIST3H3 missense mutation with high 
pathogenicity scores that was the only plausible change in a family 
with non-syndromic intellectual disability. Together, at least ten of the 
novel candidate genes for ARID involve histone structure, histone 
modification, chromatin remodelling or the regulation of transcrip- 
tion, and many of these genes are functionally linked to known and 
novel intellectual disability genes, as shown in Fig. 1a. 

Several other mutated genes are directly or indirectly involved in 
the regulation of translation. A homozygous frameshift mutation 


inactivating the TRMT1 gene was detected in a family with 
non-syndromic intellectual disability. TRMT1 is an RNA methyltransferase 
ferase that dimethylates a single guanine residue at position 26 of most 
tRNAs. Previously we and others have shown that inactivation of the 
X-linked gene FTSJ1, another RNA methyltransferase, also gives rise to 
non-syndromic intellectual disability, and we have recently 
identified several ARID families with truncating mutations in a third RNA 
methyltransferase (L.A.M. et al., manuscript in preparation). A large 
deletion in the EEF1B2 gene was the only detectable defect in another 
family with non-syndromic intellectual disability. EEF1B2 encodes the 
elongation factor 1B, which is involved in the transport of aminoacyl- 
tRNAs to the ribosomes. In yet another family with non-syndromic 
intellectual disability, a missense change was found in ADRA2B. This 
gene encodes a brain-expressed G-protein-coupled receptor that 
Table 2 | Continued 


Family Phenotype Gene Mutation LOD score Length (Mb) Supporting evidence 

8303971 S$ KDM6B P888S 3.1 5.08 S, P; demethylase 6B specifically targeting Lys 27 of histone H3, has a central role in 
regulation of posterior development by regulating HOX gene expression®’. Mutation of 
KDM5A gives rise to ID (see family M061). 

M154 S KIF7 E758K 2.1 746 P, E; knockout mouse model with complex picture involving brain and other neurological 
abnormalities. Stickler-like clinical features in this family can be explained by co-existing 
COL9A1 mutation. 

M183 S$ LAMA1 G1572fs 2.1 5.82 S, P; codes for subunit of laminin, role in attachment, migration and organization of cells 
during embryonic development. Required for normal retinal development in mice®?. 

G030 S LARP7 K276fs 1.93 8.94 S,P; encodes negative transcriptional regulator of polymerase II genes”°. 

7903104 § LINS1 H329fs 2.65 7.87 S, P; similar to lin, a Drosophila gene having important roles in the development of the 
epidermis and the hindgut. Link with ID unclear. 

8600060† NS MAN1B1 R334C 3.13 2.49 P, E; encodes mannosidase that targets misfolded glycoproteins for degradation. MAN1B1 
frameshift mutation observed in another ARID family by Canadian group (J. Vincent, 
personal communication). 

8600277 NS NDST1 R709Q 2.1 10.18 S, P; only change in family. Encodes heparan N-deacetylase/N-sulphotransferase, 
deficiency is lethal in mice due to respiratory distress’. No obvious link with ID. 

M158 S PARP1 L293F 1.8 16.76 P; poly(ADP-ribose) polymerase involved in histone 1 modification; role in memory 
stabilization in mice. 

M194 NS,ASD  PECR L57V 2.5 Li27 P; brain-expressed peroxisomal trans-2-enoyl-CoA reductase involved in the biosynthesis 
of unsaturated fatty acids’°. 

8401214 S POLR3B T199K 1.93 24.89 E; second-largest core component of RNA polymerase III, which synthesizes small RNAs 
such as tRNAs and 5S rRNAs*?. 

8500302 NS PRMT10 G189R 2.65 9.75 P, E; protein arginine methyltransferase 10. Protein arginine methylation affects 
chromatin remodelling leading to transcriptional regulation, RNA processing, DNA repair 
and cell signalling”. 

M010 NS PRRT2 A214fs 5.2 25.59 P; interacts with SNAP25 which in turn assembles with syntaxin-1 and synaptobrevin to 
form exocytotic fusion complex in neurons°?. 

8500155 NS RALGDS A706V 4.0 5.56 S, E; effector of Ras-related RalA and RalB GTPases, role in synaptic plasticity”®. Interacts 
with CNKSR1, inactivated in family 8500235. 

8700136 NS,ASD RGS7 N304fs 2.53 24.34 P; regulator of G protein signalling. Interacts with 14-3-3 protein, tau and snapin, a 
component of the SNARE complex required for synaptic vesicle docking and fusion’°. 
Indirectly linked with ADRA2B, mutated in family M266_2. 

8600086 NS SCAPER Y118fs 3.9 7.45 S, E; interacts with CCNA2/CDK2 complex, transiently maintains CCNA2 in cytoplasm’°®. 
CCNA2 is mutated in family M346. 

8600012 S SLC31A1 R90G 2.1 3.85 P, E; encodes one of two genes involved in copper import. Deficiency of the SLC31A1 
orthologue in mice is early lethal, heterozygotes have progressive neurological disorder’’, 
similar to patients in this family. 

M177 S TAF2 W649R 2.1 9.16 P, E; TATA-box-associated gene is very important regulator of transcription (see OMIM 
604912). Other TAF genes have been implicated in X-linked ID (V.K. et al., manuscript in 
preparation). MAL2 is another, less likely, candidate in this family. 

M160 S TMEM135 C228S 2.4 6.89 S, P, E; transmembrane protein involved in fat metabolism and energy expenditure’®. 

M300 NS TRMT1 I230fs 3.4 0.34 P, E; encodes dimethylguanosine tRNA methyltransferase”’. At least two other RNA 
methyltransferases have been implicated in ID (ref. 17 and L.A.M., manuscript in 
preparation). 

M168 NS,ASD UBR7 N124S 2.5 8.78 P, E; encodes n-recognin 7, a component of E3 ubiquitin ligase®°. Involved in protein 
degradation, which has been implicated in ID. 

8500320 S WDR45L R109Q 1.93 2.55 P, E; WD repeat domain, phosphoinositide-interacting protein 3, ILF1-like®*, specific 
function unknown. 

M169 S ZBTB40 Q525X 3.5 14.56 S, P, E; Krüppel-type zinc finger, highly expressed in brain. Regulator of glia 
differentiation?®. 

M156 NS ZCCHC8 L90X 2.3 7.64 P; zinc-finger protein, identified in the spliceosome C complex. Interacts with BRCA1 and 
RBM7®2®>. RBM10 has been implicated in X-linked ID (V.K. et al., manuscript in 
preparation). 

M025 NS ZNF526 R459Q 4.5 6.13 P; zinc-finger protein, only remaining change in family. Functional relevance supported by 
3D modelling. Probable activator of mRNA translation. Allelic ZNF526 mutation observed 
in family 8500156. 

8500156 NS ZNF526 Q539H 4.04 11.33 P; see family M025 with allelic ZNF526 mutation. 


References 44-83 are listed in Supplementary Information. E, high evolutionary conservation score; P, high pathogenicity score, includes truncating mutations; S, only change found in family. ASD, autism 
spectrum disorder; GPCR, G-protein-coupled receptor; ID, intellectual disability; NS, non-syndromic; S, syndromic. 

* See Supplementary Information for further details. 
+ Parents are distantly related. LOD scores provided are minimum estimates, calculated on the assumption that they are second cousins. 
‡ In ethnically matching healthy controls a single heterozygous carrier was found (for details, see Supplementary Table 3). 


associates with EIF2B, a guanine exchange factor regulating trans- 
lation’”’; notably, ADRA2B also interacts with the 14-3-3 protein, 
which in turn associates with RGS7, another novel ARID gene product 
that regulates G-protein signalling. Finally, in a family with a syndromic 
form of intellectual disability, a missense change was found in the 
POLR3B gene, involving a nucleotide with a very high conservation score 
and predicted to be pathogenic by Mutation Taster. POLR3B encodes 
the second-largest core component of RNA polymerase III, which 
synthesizes small RNAs such as tRNAs and 5S rRNAs” and also inter- 
acts with ENTPD1, the product of a novel candidate gene for intellectual 
disability (see GeneCards, http://www.genecards.org/cgi-bin/cardsearch.pl?search=POLR3B 
and Table 2). Together, these observations 
indicate that gene defects interfering with transcription and translation 
are particularly important causes of intellectual disability. 

However, we also found pathogenic mutations affecting other fun- 
damental cellular functions and pathways such as cell-cycle control, as 
illustrated by a mutation inactivating CCNA2, and another one trun- 
cating SCAPER, a specific regulator of the CCNA2-CDK2 complex 
(see Fig. 1b). The C11orf46 gene encodes TTI2, a subunit of the Triple 
T complex, which is required for the establishment of cell-cycle check- 
points and for DNA-damage signalling’. Other mutations involved 
fatty-acid synthesis and turnover (ACBD6, FASN and PECR; see 
Table 2), protein degradation (UBR7), splicing (ZCCHC8) and cell 
migration (LAMA1). 
Figure 1 | Known and novel intellectual disability genes form protein and 
regulatory networks. a, Transcriptional/translational network. b, Cell-cycle- 
related network. c, Ras/Rho/PSD95 network. Connecting edges in the figure 
stand for protein-protein interactions. Arrows define direction of post- 
translational protein modifications: a, acetylation; ar, ADP-ribosylation; d, 


Intellectual disability genes with brain-specific functions 


Not surprisingly, several mutations involved genes with neuron- or 
brain-specific functions. For example, we found a frameshift mutation 
abolishing the function of CACNA1G, a T-type calcium channel with a 
critical role in the generation of GABAB receptor-mediated spike and 
wave discharges in the thalamocortical pathway****. A nonsense muta- 
tion inactivated ZBTB40, which has a role in glia cell differentiation”’, 
and other observed changes are expected to interfere with the regu- 
lation of neurotransmission, exocytosis or neurotransmitter release. 
Our study also adds several novel intellectual-disability-associated 
genes to the Ras and Rho pathway (see Fig. 1c); for example, a convin- 
cing missense mutation in the RALGDS gene was the only variant 
detected in one family with non-syndromic intellectual disability. 
This gene encodes an effector of the Ras-related GTPase Ral, which 
stimulates the dissociation of GDP from the Ras-related RalA and RalB 
GTPases, thereby allowing GTP binding and activation of the 
GTPases”*. Regulators of small GTPases were among the first genes 
to be implicated in non-syndromic intellectual disability’”**. We also 
found a homozygous frameshift mutation in CNKSR1, which is phys- 
ically associated with RALGDS. Homozygous carriers of this mutation 
have a severe syndromic phenotype with quadrupedal gait. CNKSR1 
binds to rhophilin (Online Mendelian Inheritance in Man (OMIM) 
601031), a Rho effector, suggesting that it acts as a scaffold protein and 
mediates crosstalk between the Ras and Rho GTPase signalling pathways”. 
Neither RALGDS nor CNKSR1 had been implicated in intellectual 
disability so far; thus, both are novel ARID genes. 


Genes without obvious link to intellectual disability 


For several of the sequence variants, there is no obvious functional 
link between the molecular defect and intellectual disability. This 
applies to LINS1 and NDST1, and it is not easy to understand why 
in humans, adenosine kinase deficiency should lead to intellectual 
disability, whereas in the mouse, overexpression of Adk causes neuro- 
logical symptoms, and Adk deficiency gives rise to early lethal liver 
steatosis*®. Nothing is known yet about the function of the C12orf57 
gene, apart from its apparent overlap with ATN1 (see UCSC Genome 
Browser, NCBI36/hg18). CAG trinucleotide expansion in the ATN1 
gene is the cause of dentatorubral pallidoluysian atrophy (DRPLA), 
another syndromic form of intellectual disability. A comprehensive 
list of families with single, probably disease-causing mutations is 
shown in Table 2. 


demethylation; da, deacetylation; dq, deubiquitination; m, methylation. Dotted 
lines indicate modulation of gene function. Data were obtained in part by using 
the INGENUITY software package (http://www.ingenuity.com) and by 
literature mining. More details about these proteins and their interactions are 
provided in Table 2 and in Supplementary Information. 


Despite exhaustive validation of our data and stringent filtering 
against all known neutral and pathogenic sequence variants (see 
Supplementary Information and Supplementary Tables 3-6), it is still 
possible that not all of these changes will turn out to be causative. 
Particularly for the numerous missense mutations observed, func- 
tional studies will be required to rule out rare polymorphisms that 
are unrelated to intellectual disability. In a previous study, 1% of the 
protein-truncating mutations on the X chromosome were found to be 
unrelated to disease*', and in our study, 12 observed inactivating 
mutations did not co-segregate with intellectual disability (see 
Supplementary Table 4). However, we believe that the vast majority 
of the changes presented here as probably pathogenic will be con- 
firmed, even if they have been observed only once, because most of the 
proteins encoded by these novel candidate genes interact with the 
products of known or novel genes associated with intellectual disability, 
as shown in Fig. 1. 


Most ARID genes are not synapse specific 


We have previously shown that ARID is an extremely heterogeneous 
disorder’. In contrast to non-syndromic hearing impairment or 
X-linked intellectual disability, common forms of ARID do not seem 
to exist, although there is evidence for regional clustering of the under- 
lying gene defects’. Extrapolating from the number of known 
X-chromosomal intellectual disability genes argues for the involvement 
of several hundred genes in non-syndromic ARID, and the total num- 
ber of ARID genes may well run into the thousands’. Identification of 
most or all of these genes is a prerequisite for early diagnosis, prevention 
and, eventually, therapy of intellectual disability, but at the present pace, 
many years would be required to accomplish this task. Here, we have 
combined homozygosity mapping, targeted exon enrichment and next- 
generation sequencing to speed up the molecular elucidation of ARID. 
In 78 out of 136 consanguineous families investigated, we have found 
apparently pathogenic mutations in single genes. Fifty of these genes 
had not been implicated in ARID before, and only two of these novel 
intellectual disability genes were found to be mutated in two independ- 
ent families. None of the ~10 previously known genes for non- 
syndromic ARID, including those that were identified in Iranian 
families’ °°, was observed in our present cohort, thereby corroborating 
previous evidence that ARID is extremely heterogeneous. 

Much of the research into the molecular causes of intellectual dis- 
ability has focused on the synapse and synapse-specific genes (for 
example, see refs 2, 37). In the present study, relatively few of the novel 
defects identified involve synapse- or neuron-specific genes, and they 
are vastly outnumbered by ubiquitously expressed genes with indis- 
pensable cellular functions, such as DNA transcription and trans- 
lation, protein degradation, mRNA splicing, energy metabolism as 
well as fatty-acid synthesis and turnover. Many of these defects were 
found to be associated with non-syndromic ARID. It is not immedi- 
ately clear why the clinical consequences of defects involving such a 
wide spectrum of basic cellular processes should be confined to the 
brain, but this conceivably reflects the complexity of the central nerv- 
ous system which may render it particularly vulnerable to damage. 

We expect that these findings will have direct implications for the 
diagnosis and prevention of intellectual disability, and perhaps also 
for autism, schizophrenia and epilepsy, which often co-exist in intel- 
lectual disability patients and are frequently associated with muta- 
tions in the same genes (for example, see ref. 38; reviewed in ref. 1). 
Further investigation of the novel genes and networks presented here 
should significantly deepen our insight into the pathogenesis of intel- 
lectual disability and related disorders. Moreover, this study illustrates 
the power of large-scale next-generation sequencing in families as a 
general strategy to shed light on the aetiology of complex disorders 
and on the function of the underlying genes. 

Note added in proof: While this work was in the press, two unrelated 
groups reported on inactivating ERLIN2 mutations in patients with 
recessive intellectual disability and progressive motor dysfunction?”. 
Moreover, syndromic forms of intellectual disability have been described 
in patients with AP4B1 and AP4E1 (ref. 41) and MAN1B1 (ref. 42) 
mutations, respectively. Finally, mutations inactivating the KIF7 gene 
were identified as the cause of the recessive fetal hydrolethalus and 
acrocallosal syndromes that include brain malformations”. 


METHODS SUMMARY 


Most families studied were from Iran, and less than 10% had a Turkish or Arabic 
background. Wechsler Intelligence Scales for Children (WISC) and the Wechsler 
Adult Intelligence Scale (WAIS) were 
used to assess the IQ in children and parents. Many of the pedigrees, as well as the 
methods used for autozygosity mapping, have been described previously. 

Exons from homozygous intervals were enriched with custom-made Agilent 
SureSelect DNA capture arrays and sequenced on an Illumina Genome Analyser 
II, yielding 76-bp single reads. More than 98% of the targeted exons were covered by at least 
four non-redundant sequence reads, each with a PHRED-like quality score of 20 
or above (mean, 0.984; median, 0.993; for details, see Supplementary Table 5). 

To assess the reliability of this procedure for calling homozygous mutations, we 
looked up SNP markers from homozygous intervals of five selected families that 
had been analysed with high-resolution SNP arrays. For 773 out of 776 markers, 
next-generation sequencing and array-based SNP typing yielded identical results. 

To detect single nucleotide variants, high-quality reads were aligned to the 
human reference genome (hg18) by SOAP2.20 with default settings, typically 
gap-free. Homozygous exon-spanning deletions were assumed if the sequence 
coverage of the relevant exon(s) was reduced to <5% of the mean. Details about 
the detection of smaller deletions and insertions are provided in Methods. All 
variants were validated by high-resolution array CGH, Sanger sequencing, or 
both. 
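
The coverage rule for exon-spanning deletions can be illustrated with a minimal sketch. It assumes per-exon mean read depths are already available; the function name, the exon labels and the depth values are hypothetical, and taking the mean over all exons in the example (including the candidate) is an assumption, because the text does not specify which mean was used.

```python
from statistics import mean


def flag_deleted_exons(coverage_by_exon, threshold=0.05):
    """Return exons whose depth is below `threshold` times the mean depth.

    `coverage_by_exon` maps an exon label to its mean read depth.
    The 5% threshold follows the rule quoted in the text; the choice of
    reference mean (here: all exons passed in) is an assumption.
    """
    cutoff = threshold * mean(coverage_by_exon.values())
    return sorted(exon for exon, depth in coverage_by_exon.items() if depth < cutoff)


# Hypothetical depths for five exons in one homozygous interval.
coverage = {"exon1": 42.0, "exon2": 38.5, "exon3": 0.4, "exon4": 45.1, "exon5": 40.0}
deleted = flag_deleted_exons(coverage)
print(deleted)
```

Any exon flagged in this way would still need confirmation by array CGH or Sanger sequencing, as described above.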

Homozygous variants were filtered against dbSNP130/131, whole genomes 
from 185 healthy individuals studied by the 1000 Genomes Project and exomes 
from 200 Danish individuals, and found to be absent in at least 100 chromosomes 
from Iranian controls (see Supplementary Tables 1 and 3). To select and prioritize 
apparently disease-causing variants, various criteria were used (for more details, 
see Methods). All putative mutations co-segregated with intellectual disability in 
the respective families. 
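
The filtering logic described above can be sketched as a simple set-based filter. This is an illustration only: the `Variant` record, the function name and the placeholder gene labels are hypothetical, and the actual prioritization used further criteria that are described in Methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str           # placeholder gene label, not a real finding
    change: str         # protein-level change
    homozygous: bool    # called homozygous within the linkage interval
    cosegregates: bool  # segregates with intellectual disability in the family


def select_candidates(variants, database_variants, control_variants):
    """Keep homozygous, co-segregating variants absent from databases and controls."""
    excluded = set(database_variants) | set(control_variants)
    return [
        v for v in variants
        if v.homozygous and v.cosegregates and (v.gene, v.change) not in excluded
    ]


# Hypothetical calls from one family.
calls = [
    Variant("GENE_A", "R10Q", True, True),
    Variant("GENE_B", "K20fs", True, True),   # present in a variant database
    Variant("GENE_C", "L30F", False, True),   # not homozygous
    Variant("GENE_D", "G40R", True, False),   # does not co-segregate
]
database = {("GENE_B", "K20fs")}
controls = set()

candidates = select_candidates(calls, database, controls)
print([v.gene for v in candidates])
```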
Mechanical strain in actin networks regulates FilGAP 
and integrin binding to filamin A 


A. J. Ehrlicher, F. Nakamura, J. H. Hartwig, D. A. Weitz & T. P. Stossel 


Mechanical stresses elicit cellular reactions mediated by chemical 
signals. Defective responses to forces underlie human medical 
disorders’ * such as cardiac failure and pulmonary injury*®. The 
actin cytoskeleton’s connectivity enables it to transmit forces 
rapidly over large distances’, implicating it in these physiological 
and pathological responses. Despite detailed knowledge of the 
cytoskeletal structure, the specific molecular switches that convert 
mechanical stimuli into chemical signals have remained elusive. 
Here we identify the actin-binding protein filamin A (FLNA)*? 
as a central mechanotransduction element of the cytoskeleton. 
We reconstituted a minimal system consisting of actin filaments, 
FLNA and two FLNA-binding partners: the cytoplasmic tail of 
β-integrin, and FilGAP. Integrins form an essential mechanical 
linkage between extracellular and intracellular environments, with 
β-integrin tails connecting to the actin cytoskeleton by binding 
directly to filamin*. FilGAP is an FLNA-binding GTPase- 
activating protein specific for RAC, which in vivo regulates cell 
spreading and bleb formation’®. Using fluorescence loss after 
photoconversion, a novel, high-speed alternative to fluorescence 
recovery after photobleaching"', we demonstrate that both externally 
imposed bulk shear and myosin-II-driven forces differentially regulate 
the binding of these partners to FLNA. Consistent with structural 
predictions, strain increases β-integrin binding to FLNA, 
whereas it causes FilGAP to dissociate from FLNA, providing a 
direct and specific molecular basis for cellular mechanotransduc- 
tion. These results identify a molecular mechanotransduction ele- 
ment within the actin cytoskeleton, revealing that mechanical strain 
of key proteins regulates the binding of signalling molecules. 

The composite cytoskeleton network in vivo provides dynamic cel- 
lular structure and actively generates movement. A physiological 
reconstituted in vitro network of actin and FLNA creates an elastic 
gel mechanically dominated by the rod-like actin filaments and cross- 
linked by flexible FLNA molecules. Applying strain to this network 
readily deforms FLNA crosslinks (Fig. 1a, b), and the specific structure 
and actin binding of FLNA suggest how these deformations might 
affect FLNA’s interactions with some of its 90 or so other currently 
identified binding partners’. 

FLNA is an extended homodimer composed of two identical sub- 
units, each having an amino-terminal actin-binding domain followed 
by 24 immunoglobulin repeats’? (Fig. 1c, d). The actin-binding 
domains and repeats 1-15 are designated ‘rod 1’ and form a linear 
structure that binds actin filaments. Repeats 16-23, comprising ‘rod 2’, 
however, form compact globular clusters that do not interact with 
actin filaments and contain most of FLNA’s binding-partner sites. 
Strain-dependent reversible straightening of these domains contributes 
to FLNA–actin network flexibility and may regulate local binding-partner 
affinity (Supplementary Fig. 1). Here we examine the effects 
of mechanical strain on FLNA’s interactions with two key rod-2 binding 
partners: cytoplasmic β-tail integrin, which nucleates an extensively 
characterized signalling’* and adhesion’* complex, and FilGAP, which 
is a GTPase-activating protein specific for RAC, a regulator of cellular activity such as actin 


assembly’. Mechanical strain may regulate partner binding, and we 
propose that stretching FLNA crosslinks not only causes FilGAP to 
unbind, but also causes integrin to bind more strongly (Fig. 1c, d and 
Supplementary Fig. 1). Neighbouring immunoglobulin repeats cover 
integrin binding sites on FLNA repeats 19 and 21 (refs 15, 16), but 
computational simulations suggest that rod 2 of FLNA is highly flexible 
and that physiological forces are sufficient to expose these cryptic sites, 
allowing integrin to bind’”’* (Supplementary Fig. 1a, b). FilGAP binding 
occurs on each repeat 23, suggesting that FilGAP is able to bind repeat 
23 on both subunits simultaneously when unstressed, providing suf- 
ficient avidity to promote FilGAP association with FLNA (Fig. 1c and 
Supplementary Fig. 1c). Mechanical stretching of FLNA spatially sepa- 
rates repeats 23, preventing FilGAP from binding simultaneously to 
both”, thus causing it to dissociate (Fig. 1d and Supplementary Fig. 1d). 

To test these hypotheses and measure the effect of mechanical stress 
on binding-partner interactions with FLNA, we reconstituted networks 
of polymerized actin (F-actin) and FLNA containing the binding partner 
FilGAP or β7-integrin. To quantify the strain-dependent 


[Figure 1 image; panel labels: a, b; network shear; mechanical deformation; actin filaments.] 

Figure 1 | Differential mechanotransduction in FLNA occurs through 
spatial separation of binding sites and opening cryptic sites. a, A gel of actin 
(red) crosslinked with filamin (blue) forms an orthogonal network. b, When 
this network is strained, crosslinks are deformed. c, The actin-binding domain 
of FLNA is shown in black, and is followed by repeats 1-7 (light blue) and 8-15 
(red/orange), which form the linear rod-1 region. Repeats 16-23 (dark blue) 
form the compact rod-2 region. FilGAP (green) binds repeats 23 and the 
cytoplasmic domain of β7-integrin (purple) is unbound. d, When FLNA is 
mechanically deformed, the cryptic integrin site on repeat 21 is exposed, 
allowing β7-integrin to bind, whereas repeats 23 are spatially separated, 
preventing FilGAP from binding them both. 
kinetics of these partners of FLNA, we developed fluorescence loss after 
photoconversion (FLAC), which takes advantage of the rapid photo- 
activation or conversion of photoactivatable fluorescent proteins 
(PAFPs) as a high-speed analogue to fluorescence recovery after photobleaching". 
In FLAC, a sample with an initially non-fluorescent binding 
partner is locally illuminated with a 50-ms pulse of 405-nm light, 
rapidly and permanently activating PAFP-conjugated partner fluor- 
escence (Supplementary Figs 4 and 5). Photoactivation fluorescently 
marks the sample faster than conventional photobleaching, and with- 
out the requirement of high excitation flux. After activation, unbound 
PAFP rapidly diffuses away, decreasing the fluorescence signal, whereas 
bound PAFP dissociates more slowly. The time-dependent decay of 
PAFP intensity reveals the kinetics of the FLNA binding partner, as 
slower decay indicates slower unbinding, and thus provides a direct, 
high-speed assay of dissociation. 

We tested the utility of these PAFP constructs in assaying binding 
kinetics by reconstituting F-actin and the PAFP-labelled binding partner 
with different forms of FLNA that have higher or lower affinities for 
β7-integrin or FilGAP. Consistent with immunoprecipitation data 
(Supplementary Fig. 3b, c), the fluorescence decay of β7-integrin 
labelled with photoactivatable green fluorescent protein (PA-GFP 
β7-integrin) was faster in wild-type FLNA networks than in the 
del41 variant (Supplementary Movie 1), demonstrating relatively 
stronger binding in the del41 mutant than in wild type. The fluor- 
escence decay of PA-GFP FilGAP was slower in wild-type FLNA net- 
works than in the M2474E mutant (Supplementary Movie 2), also in 
agreement with immunoprecipitation data (Supplementary Fig. 3a). 

We then applied the FLAC methodology to measure the mechano- 
sensitive aspect of interactions between PAFP-labelled binding partners 
and FLNA. We sheared networks of F-actin and FLNA containing 
PAFP-tagged FilGAP or β7-integrin in a precise and highly controlled 
fashion using a microscope stage comprising a stationary coverslip for 
the bottom of the sample and a piezo-controlled linear actuator for the 
top. When the FLNA/F-actin network was not strained, β7-integrin had 
a characteristic exponential decay time of 1.3 ± 0.1 s. The application of 
a shear strain, γ = 0.28, increased this time to 3.5 ± 0.3 s (Fig. 2a). The 
change in fluorescence decay rate under strain describes how the geometric 
state of FLNA affects dissociation of β7-integrin; thus, mechanically 
stretching FLNA molecules enhanced β7-integrin binding. 
By contrast, FilGAP showed the opposite behaviour: unstrained networks 
had a characteristic fluorescence decay time of 3.6 ± 0.7 s, which 
decreased to 0.6 ± 0.1 s when a shear strain of γ = 0.28 was applied 
(Fig. 2b). FLNA does not permanently crosslink actin, and by unbinding 
and rebinding on the timescale of ~6 min (Supplementary Fig. 6), it 
dynamically allows the network to relax to an unstressed state. After 
10 min under strain, the network had sufficient time to dissipate 


internal stress through FLNA remodelling, and the fluorescence decay 
time increased to 6.1 ± 0.7 s, demonstrating the reversibility of strain-modulated 
FilGAP binding to FLNA (Fig. 2b). 

The application of unidirectional shear reveals the effects of strain 
on partner binding to FLNA; however, cells commonly generate 
internal stresses using molecular motors such as myosin. To examine 
the effects of cytoskeleton-induced stress, and as a physiological tech- 
nique complementary to external shear, we included myosin II in the 
networks to generate contractile stress” (Supplementary Fig. 9 and 
Supplementary Movie 3). We allowed the composite network to 
assemble and come to an unstressed equilibrium state over ~6h, after 
which time the incorporated myosin II had ceased contracting, by 
enzymatically exhausting the pool of added ATP, and dynamic 
FLNA remodelling had dissipated internal stresses. For unstressed 
FLNA, we measured β7-integrin and FilGAP fluorescence decay times 
of 1.6 ± 0.1 s and 1.5 ± 0.1 s, respectively (Fig. 3a, c). Including photolabile 
‘caged’ ATP in the sample allowed us to release fresh ATP and 
restart myosin motor activity’’”’, which contracts the actin network 
and strains FLNA crosslinks. In myosin-stressed FLNA, the integrin 
unbinding time increased to 2.5 ± 0.2 s but the FilGAP unbinding time 
decreased to 0.9 ± 0.1 s (Fig. 3b, d). The application of either external 
shear or myosin contraction resulted in increased integrin binding and 
decreased FilGAP binding, demonstrating the robust, opposite beha- 
viours of these FLNA binding partners. 
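
The opposite responses of the two partners can be summarized as fold changes in decay time. The sketch below uses only the decay times reported in the text; treating the reciprocal of the decay time as an apparent off-rate is a simplifying assumption made here for illustration, and the dictionary keys and function name are hypothetical.

```python
# Decay times in seconds (unstrained, strained), as reported in the text.
decay_times_s = {
    ("shear", "beta7-integrin"): (1.3, 3.5),
    ("shear", "FilGAP"): (3.6, 0.6),
    ("myosin", "beta7-integrin"): (1.6, 2.5),
    ("myosin", "FilGAP"): (1.5, 0.9),
}


def summarize(times):
    """Return fold change in decay time and apparent off-rates (1/k) per condition."""
    summary = {}
    for key, (k_rest, k_strained) in times.items():
        summary[key] = {
            "fold_change_in_decay_time": k_strained / k_rest,
            # Assumption: a single-exponential decay with time constant k
            # corresponds to an apparent off-rate of 1/k.
            "apparent_off_rate_rest": 1.0 / k_rest,
            "apparent_off_rate_strained": 1.0 / k_strained,
        }
    return summary


summary = summarize(decay_times_s)
for (condition, partner), row in summary.items():
    print(f"{condition:6s} {partner:15s} x{row['fold_change_in_decay_time']:.2f}")
```

A fold change above 1 indicates slower dissociation under strain (integrin), and a value below 1 indicates faster dissociation (FilGAP), for both the shear and the myosin experiments.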

The FLNA-crosslinked actin cytoskeleton is a large, percolated net- 
work that, owing to its filamentous actin structure, can readily transmit 
large mechanical deformations over long intracellular distances, yet 
FLNA is mechanosensitive to nanometre-scale molecular deforma- 
tions. This range of length scales contrasts with that of focal adhesion 
mechanosensitivity, which detects local mechanics and is limited to 
small spatial and strain scales owing to their size and connectivity”. 

In conclusion, we have developed in vitro systems to study quanti- 
tatively protein-protein interactions under mechanical force. Using 
PAFPs with the FLAC technique provides the increases in time reso- 
lution necessary for measuring transient kinetics, without the harsh 
intensity or duration of bleaching exposure required for fluorescence 
recovery after photobleaching. The results presented here establish 
FLNA as a mechanotransductive substrate within the cytoskeleton 
and highlight the utility of in vitro systems, in combination with 
FLAC, to determine quantitative responses of specific proteins. 

Mechanotransduction in vitro provides the biological specificity 
necessary for understanding how these complex regulatory signals 
may operate in vivo. Cellular mechanotransduction has been shown 
to induce rapid biochemical activity over long distances”. Because 
mechanical stimuli induce relatively large local deformations that 
decrease in magnitude with distance from the site of application, 
Figure 2 | External bulk shear of F-actin/FLNA networks alters FLNA’s 
binding affinity for β7-integrin and FilGAP. a, Fluorescence intensity of PA-GFP 
β7-integrin as a function of time after photoactivation. When unstrained 
(blue), fluorescence of β7-integrin decays with a characteristic time constant of 
k = 1.3 s. Following the application of a shear strain of γ = 0.28, the time 
constant increases to 3.5 s, as the integrin dissociates more slowly from FLNA 
(n = 18). a.u., arbitrary units. b, Fluorescence intensity of PA-GFP FilGAP as a 
function of time after photoactivation. Unstrained (blue) FilGAP’s fluorescence 
decay time is k = 3.6 s. A shear strain of γ = 0.28 (red) decreased k to 0.6 s. This 
behaviour is reversible, and after allowing the network to relax for 10 min, 
removing all strain, k increases to 6.1 s (brown) (n = 10). 
Figure 3 | Myosin II forces applied to F-actin/FLNA networks change 
FLNA’s binding affinity to β7-integrin and FilGAP. a, When depleted of 
ATP, myosin is in a rigor state. The FLNA within the network is not stressed 
and PA-GFP β7-integrin fluorescence decays with a characteristic time 
constant of k = 1.6 s (blue). After caged ATP is released, myosin reactivates, 
straining FLNA crosslinks. The decay time constant increases to 2.5 s because 
the integrin dissociates more slowly from FLNA under stress. b, PA-mCherry 


FLNA mechanotransduction in vivo probably provides a rapid, 
distance-sensitive biphasic response by binding or unbinding integrins 
or FilGAP, respectively, as a result of the transmitted strain. Physi- 
ologically, the localization and binding of these proteins determine 
their activity. Strain-induced binding of integrin to FLNA may com- 
pete with talin binding to integrin’®, thus providing a mechano- 
sensitive switch for integrin activation and adhesion. FLNA’s homo- 
dimer structure may induce clustering of integrin, thereby reinforcing 
adhesion and concentrating signalling molecules at a specific location. 
FilGAP, when unbound from FLNA, relocates to the plasma mem- 
brane, where it inactivates RAC’. Active RAC levels profoundly affect 
cell movement’’, and increased RAC activity in FLNA-deficient cells 
correlates with increased apoptosis**. Moreover, our measurements 
are consistent with in vivo studies demonstrating that RAC activity 
and expression seem to be force-regulated by FilGAP-FLNA interac- 
tions, because inhibiting FLNA or FilGAP increases RAC activity but 
applying local forces to wild-type cells causes FilGAP to decrease RAC 
activity”*. Because FLNA does not change FilGAP’s catalytic activity, 
mechanically induced redistribution alone might explain its regulation 
in vivo. Force-dependent conformational changes in structure required 
for mechanical regulation have been observed in many proteins, includ- 
ing FLNA in vivo”. By identifying FLNA as a mechanosensitive ele- 
ment within the cytoskeleton, we have clarified how RAC and integrin 
activity may be regulated by a specific molecular mechanotransduction 
pathway. Identifying mechanotransduction elements may direct novel, 
targeted therapeutic approaches by correcting or modulating mechan- 
osensitive binding. 


alone as a control shows no significant difference between the unstrained and 
strained states. c, Fluorescence intensity of PA-GFP FilGAP as a function of 
time after photoactivation. d, In the ATP-depleted state, FilGAP’s fluorescence 
decay time is k = 1.5 s, and after release of the caged ATP (red), k decreases to 
0.9s. In PA-GFP V734Y FilGAP, a non-FLNA binding mutant used as a 
control, the decay time in the unstrained state (0.7 s) is not significantly 
different from that in the strained state (0.8 s) (n = 20). 


METHODS SUMMARY 


PAFP fluorophore synthesis. PAFP fluorophore was genetically tagged to 
binding partners, creating PAFP-labelled β7-integrin and FilGAP. Solubility and 
correct binding of labelled partners was confirmed using western blots 
(Supplementary Fig. 3). 

FLAC methodology. An external 405-nm laser was coupled into a Leica SP5 
confocal microscope and used to illuminate a central, ~2-µm spot for 50 ms, 
converting the PAFP from its dark state to its fluorescent state (Supplementary 
Figs 4 and 5). The decay in the fluorescence intensity, I(t), of the activated fluor- 
ophores was monitored and fitted with the exponential I(t) = ae" K+ c,where kis 
the time constant of characteristic dissociation. Given k values represent best fits 
plus 95% confidence intervals. In the case of single-step uniaxial strain presented
in Fig. 2, data were fitted to $I(t) = a\,e^{-t/k} + 0.5\,e^{-t/(1.5\,\mathrm{s})} + c$
to provide a more accurate fit and compensate for the rapid diffusion of free
fluorophore.
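As an illustration of the single-exponential fit described above, the following minimal Python sketch recovers a, k and c from a decay trace. It is not the authors' routine (their analysis was done in MATLAB and is not reproduced here). The function names, the search range for k and the synthetic trace are all assumptions made for this example. For each trial k the amplitudes a and c are obtained by linear least squares, and k itself is located by a coarse logarithmic scan followed by a golden-section refinement.

```python
import math

def linear_amplitudes(times, intensities, k):
    """For a fixed time constant k, solve least squares for a and c in a*exp(-t/k) + c."""
    basis = [math.exp(-t / k) for t in times]
    n = len(times)
    s_b = sum(basis)
    s_bb = sum(b * b for b in basis)
    s_y = sum(intensities)
    s_by = sum(b * y for b, y in zip(basis, intensities))
    det = n * s_bb - s_b * s_b
    a = (n * s_by - s_b * s_y) / det
    c = (s_bb * s_y - s_b * s_by) / det
    return a, c

def residual(times, intensities, k):
    """Sum of squared errors of the best fit for this k."""
    a, c = linear_amplitudes(times, intensities, k)
    return sum((a * math.exp(-t / k) + c - y) ** 2 for t, y in zip(times, intensities))

def fit_decay(times, intensities, k_min=0.05, k_max=20.0, n_grid=200, tol=1e-7):
    """Return (a, k, c) for I(t) = a*exp(-t/k) + c. The k range is an assumed default."""
    # Coarse scan on a logarithmic grid to bracket the best time constant.
    grid = [k_min * (k_max / k_min) ** (i / (n_grid - 1)) for i in range(n_grid)]
    errors = [residual(times, intensities, k) for k in grid]
    best = errors.index(min(errors))
    lo = grid[max(best - 1, 0)]
    hi = grid[min(best + 1, n_grid - 1)]
    # Golden-section refinement inside the bracket.
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if residual(times, intensities, x1) < residual(times, intensities, x2):
            hi = x2
        else:
            lo = x1
    k = 0.5 * (lo + hi)
    a, c = linear_amplitudes(times, intensities, k)
    return a, k, c

# Synthetic, noise-free trace (invented values, not data from the paper).
example_times = [i * 0.05 for i in range(200)]
example_trace = [0.8 * math.exp(-t / 1.5) + 0.1 for t in example_times]
fit_a, fit_k, fit_c = fit_decay(example_times, example_trace)
```

On real, noisy traces the confidence interval on k would still have to be estimated separately, for example by bootstrapping, which this sketch does not attempt.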

Sample cell composition. Shear cell samples consisted of 24 µM actin, 0.12 µM
FLNA, 1× F-actin polymerization buffer (Methods), 2 µM Alexa 546 phalloidin
and either PA-GFP FilGAP or PA-GFP β7-integrin, and were sheared in a
piezo-driven shear cell (Supplementary Information). Sheared FLAC measurements for
strained networks were acquired approximately 5-10 s after shear. Myosin
samples included 24 µM actin, 0.12 µM FLNA, 1 µM myosin II, 1× activity buffer
(Methods) and 2 mM caged ATP, along with 2 µM Alexa 546 phalloidin and
PA-GFP FilGAP or 2 µM Alexa 488 phalloidin and PA-mCherry β7-integrin.
Samples were allowed to polymerize and consume available ATP over ~6 h.
FLAC measurements were then performed on the ATP-free unstressed network. 
Subsequently, the caged ATP (Sigma) was released by a 4-s exposure to a dif- 
fuse 50-mW, 365-nm light-emitting diode (Prizmatix), and within 3 s the network 
could be seen to homogenize under myosin contraction (Supplementary Fig. 9 and 
Supplementary Movie 3). FLAC measurements were then repeated in this active 
myosin-stressed network to quantify the strain-dependent binding activity. 
Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Protein design and synthesis. Actin was purified from rabbit skeletal muscle and
gel-filtered (HiLoad 16/60 Superdex 200pg; GE Healthcare) in G buffer (2 mM
Tris-HCl, 0.2 mM ATP, 0.2 mM CaCl2, 0.2 mM DTT and 0.005% NaN3, pH 8.0).
Aliquots of purified G-actin were frozen in liquid nitrogen and stored at −80 °C.
Before use, G-actin was thawed 12 h in advance and dialyzed against fresh G
buffer. Myosin II from rabbit skeletal muscle was obtained from Cytoskeleton
(Denver). Human full-length FLNA and FilGAP were expressed using a baculovirus
expression system (Invitrogen) in Sf9 insect cells and purified as previously
described. All the point or deletion mutants were generated using the
QuikChange site-directed mutagenesis kit (Agilent Technologies). FilGAP and
integrin constructs were expressed in Sf9 cells as follows. The complementary
DNA (cDNA) encoding PAFPs (PA-GFP and PA-mCherry, which were kind gifts 
from Jennifer Lippincott-Schwartz, NIH) were amplified by PCR using the for- 
ward primer GAAGATCTATGGTGAGCAAGGGCGAGG and the reverse pri- 
mer CGGGATCCCTTGTACAGCTCGTCCATG, and introduced into BamHI 
sites of pFASTBAC-HTb vector to construct pFASTBAC-HTb-PAFPs. The
cDNA encoding the cytoplasmic domain (amino acids 769-789) of human
β7-integrin was amplified by PCR using pET15-G4-integrin β7 (a kind gift from Mark
Ginsberg, UC San Diego) as a template with the forward primer
GCGGATCCAACTGGAAGCAGGACAGTAATC and the reverse primer
CGGAATTCAGCGAGGATTGATGGTGG, and inserted into BamHI/EcoRI sites of
pFASTBAC-HTb-PAFPs. For FilGAP, the cDNA encoding PAFPs was introduced into
pFASTBAC-HTa-FilGAP at the XbaI cleavage site. The expressed proteins were
purified by Ni-NTA affinity and gel filtration chromatography (Superose 12 and
Superdex 200, GE Healthcare) as previously described. Protein concentration
was measured by absorption at 280 nm using parameters calculated with
ProtParam (http://au.expasy.org/tools/protparam.html). Genetic fusion of
PAFPs to the binding partners did not affect the binding-partner activity.
Purified PAFP FilGAP proteins interact with full-length FLNA in the same 
dose-dependent manner as unlabelled FilGAP, and do not bind to the FLNA 
M2474E mutant, which lacks the FilGAP-binding site (Supplementary Fig. 2a). 
The PAFP-tagged cytoplasmic tail of β7-integrin was also found to retain its
binding behaviour with FLNA, and predominantly interacts with the FLNA
del41 variant, where the autoinhibitory ligand-binding site is constitutively
exposed (Supplementary Fig. 2b, c).
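The cloning steps above rely on restriction sites carried at the 5' ends of the PCR primers. As a quick sanity check, the short Python sketch below scans the four primer sequences quoted in the text for standard recognition sequences. The primer labels and the helper name `find_sites` are hypothetical and introduced only for this illustration. The recognition sequences are the standard ones for BamHI, BglII, EcoRI and XbaI.

```python
# Standard recognition sequences (5'->3') of the enzymes relevant to this cloning scheme.
RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
}

# Primer sequences as quoted in the Methods; the dictionary keys are labels made up here.
PRIMERS = {
    "PAFP_forward": "GAAGATCTATGGTGAGCAAGGGCGAGG",
    "PAFP_reverse": "CGGGATCCCTTGTACAGCTCGTCCATG",
    "integrin_forward": "GCGGATCCAACTGGAAGCAGGACAGTAATC",
    "integrin_reverse": "CGGAATTCAGCGAGGATTGATGGTGG",
}

def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [0-based start positions]} for every recognition site present."""
    seq = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = [i for i in range(len(seq) - len(motif) + 1)
                     if seq.startswith(motif, i)]
        if positions:
            hits[enzyme] = positions
    return hits

# Map each primer label to the sites it carries.
SITE_REPORT = {name: find_sites(seq) for name, seq in PRIMERS.items()}
```

In this scan the integrin primers carry BamHI and EcoRI sites, matching the BamHI/EcoRI insertion described. The PAFP forward primer appears to carry a BglII site (AGATCT) rather than a BamHI site; BglII and BamHI produce compatible overhangs, which would be consistent with insertion into BamHI sites, although the text does not state this explicitly.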

Protein pull-down assay. The purified His-FilGAP and His-PAFP FilGAP
constructs were incubated with increasing amounts of FLAG-FLNA and 20 µl of
FLAG-specific mAb M2 agarose (50% (v/v) slurry, Sigma) in binding buffer (50 mM
Tris-HCl, 150 mM NaCl, 0.1% (w/v) Triton X-100, 0.1 mM β-mercaptoethanol,
0.1 mM EGTA, pH 7.4; 400 µl) for 1 h at 25 °C. The beads were sedimented and
washed three times with binding buffer. Proteins bound to the beads were solubilized
in SDS sample buffer and separated by 9.0% (w/v) SDS-PAGE followed by
immunoblotting using rabbit polyclonal antibodies (pAbs) against FilGAP. For integrin,
the purified His-PAFP β7-integrin (amino acids 769-789) was incubated with
increasing amounts of wild-type and mutant (del41; amino-acid residues
2126-2167 are deleted) FLAG-FLNA and 40 µl of FLAG-specific mAb M2 agarose
(50% (v/v) slurry, Sigma) in binding buffer (25 mM Tris-HCl, 50 mM NaCl, 0.1%
(w/v) Tween 20, 1 mM DTT, 10% sucrose, 5 mM MgCl2, 1 mM EGTA, pH 7.4;
400 µl) for 2 h at 25 °C. The beads were sedimented and washed three times with
the binding buffer. Proteins bound to the beads were solubilized in SDS sample buffer
and separated by 10.0% (w/v) SDS-PAGE followed by immunoblotting using mouse
mAb against His conjugated with horseradish peroxidase (Sigma). For the peptide
pull-down assay, a synthetic peptide of the β7-integrin cytoplasmic domain
(Cys-KQDSNPLYKSAITTTINPR) was immobilized on Sulfo-Link agarose
beads (1 mg ml⁻¹) and mixed with increasing amounts of wild-type and mutant
(del41, AA/DK, A2272D/A2274K) FLNA. Bound FLNA was detected by
immunoblotting using mouse mAb to FLNA.

Reconstitution of actin-FLNA networks. Activity buffer (AB; 25 mM imidazole,
150 mM KCl, 5 mM MgATP, 0.2 mM CaCl2 and 1 mM DTT, pH 7.4) and
F-actin polymerization buffer (FB; 20 mM Tris-HCl, 2 mM MgCl2, 100 mM KCl,
0.2 mM DTT, 0.2 mM CaCl2, 0.5 mM ATP, pH 7.5) were formulated as established
previously. Shear cell samples consisted of 24 µM actin, 0.12 µM FLNA, 1× FB,
2 µM Alexa 546 phalloidin and either PA-GFP FilGAP or PA-GFP β7-integrin.
Shear-cell design and measurements. A P-780 (Physik Instrumente) piezo-motor,
incorporated into a home-built microscope stage, was controlled using
LABVIEW software (National Instruments). The sample component of the
microscope stage had as its lower plate a stationary glass coverslip within a stainless steel
frame and had a glass upper plate connected by a steel post to the piezo-motor. The
gap between the lower and upper plates was 300 µm. A lateral shear was applied as
illustrated in Supplementary Fig. 8. Strain is defined as the change of length
divided by the original length. Thus, a 300-µm vertical distance that is deformed
to 312 µm when sheared laterally by 85 µm is stretched by a factor of 1.04,
corresponding to an engineering strain of 0.04, or 4%. The parameter γ is defined
by the lateral shear divided by the sample thickness, yielding 85/300, or 0.28.
Using a MATLAB-based Monte Carlo simulation of affine deformation, we calculated
that a uniaxial shear strain of 0.28 causes an isotropic distribution of FLNA
crosslinks initially at 90° to increase and decrease their opening angles
symmetrically about 90° (on respective sides of the initially perpendicular
intersection; Supplementary Fig. 8). Looking at the positive half of the
distribution suggests that the weighted mean increase in opening angle is ~6.1°,
with a peak of ~7.6°.
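The geometric quantities quoted above follow from simple trigonometry, and the minimal Python sketch below reproduces them. The function name and return layout are assumptions made for this illustration. The sketch covers only the arithmetic for the deformed length, stretch, engineering strain and γ, and does not attempt the Monte Carlo simulation of crosslink angles.

```python
import math

def shear_geometry(gap_um, lateral_um):
    """Geometry of a vertical line of length gap_um sheared laterally by lateral_um.

    Returns (deformed_length_um, stretch_ratio, engineering_strain, gamma).
    """
    deformed = math.hypot(gap_um, lateral_um)   # length of the sheared line
    stretch = deformed / gap_um                 # final length / original length
    strain = stretch - 1.0                      # change of length / original length
    gamma = lateral_um / gap_um                 # lateral shear / sample thickness
    return deformed, stretch, strain, gamma

# Values taken from the text: 300-µm gap, 85-µm lateral displacement.
deformed_um, stretch_ratio, eng_strain, gamma = shear_geometry(300.0, 85.0)
```

With these inputs the deformed length rounds to 312 µm, the stretch ratio to 1.04, the engineering strain to 0.04 and γ to 0.28.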

Myosin II system. To examine the effects of cytoskeleton-induced stress, and as a
physiological technique complementary to external shear, we included 1 µM
myosin II in the networks to generate contractile stress as illustrated in
Supplementary Fig. 9 and Supplementary Movie 3. Individual myosin II molecules
bind their tail regions together to form minifilaments, bipolar assemblies of 8-13
myosin molecules. These minifilaments allow the otherwise non-processive
myosin II to operate collectively with an increased duty cycle, binding the
minifilaments to actin filaments long enough for filament sliding and network
contraction to occur. At 150 mM KCl, approximately 8-13 myosin molecules
associate into each minifilament. The number of myosin minifilaments per actin
filament, $N_{\mathrm{mf/af}}$, may be calculated from
$N_{\mathrm{mf/af}} = [m]\,n_{\mathrm{af}} / ([a]\,n_{\mathrm{mf}})$, where [m] is the
molar concentration of myosin, $n_{\mathrm{af}}$ is the number of actin monomers per actin
filament, [a] is the molar concentration of actin and $n_{\mathrm{mf}}$ is the number of myosin
molecules per minifilament. On the basis of an average actin filament length of
~5 µm and there being 13 myosin molecules per minifilament, we estimate that
approximately six minifilaments bind per actin filament. Repeating this estimation
with FLNA instead of myosin, we estimate that there are five FLNA crosslinks
per actin filament, on the basis of each crosslink being composed of two FLNA
molecules. Thus, the density of myosin minifilaments per actin filament is
approximately equal to that of FLNA crosslinks.
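The estimate above can be checked with a few lines of arithmetic. The Python sketch below assumes roughly 370 actin monomers per micrometre of filament, a commonly used figure that is not stated in the text, and uses a hypothetical helper name. The concentrations, filament length and stoichiometries are those given in the Methods.

```python
# Assumption (not from the text): about 370 actin monomers per micrometre of F-actin.
MONOMERS_PER_UM = 370

def units_per_filament(conc_uM, actin_uM, filament_length_um, molecules_per_unit):
    """Number of assemblies (minifilaments or crosslinks) per actin filament.

    Implements N = [m] * n_af / ([a] * n_unit), where n_af is the number of
    actin monomers per filament and n_unit the molecules per assembly.
    """
    monomers_per_filament = MONOMERS_PER_UM * filament_length_um
    return conc_uM * monomers_per_filament / (actin_uM * molecules_per_unit)

# 1 µM myosin II, 24 µM actin, ~5-µm filaments, 13 myosins per minifilament.
minifilaments = units_per_filament(1.0, 24.0, 5.0, 13)

# 0.12 µM FLNA, two FLNA molecules per crosslink.
crosslinks = units_per_filament(0.12, 24.0, 5.0, 2)
```

Under this assumption the two values come out close to six minifilaments and five crosslinks per filament, in line with the estimates quoted.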

Samples were composed of 24 µM actin, 0.12 µM FLNA, 1 µM myosin II, 1×
AB and 2 mM caged ATP, along with 2 µM Alexa 546 phalloidin and 50 nM
PA-GFP FilGAP or 2 µM Alexa 488 phalloidin and 50 nM PA-mCherry β7-integrin.
Each sample was allowed to polymerize and consume available ATP over ~6 h,
at which point myosin minifilaments had ceased contracting and were in a rigor
state. FLAC measurements were then performed on the ATP-free unstressed
network. Subsequently, the caged ATP (Sigma) was released by a 4-s exposure
to a diffuse, 50-mW, 365-nm light-emitting diode (Prizmatix). Within 3 s, the
network could be seen to become active, and it substantially homogenized within
approximately 5 min. FLAC measurements were repeated in the active
myosin-stressed network 10-20 min after ATP release, to quantify the strain-dependent
binding activity.
Imaging and analysis. Fluorescence recovery after photobleaching (FRAP)
provides an effective method to measure the diffusion and binding interactions of
fluorescent proteins; however, the time resolution of FRAP is not adequate for
fast transient events. To overcome this time limitation, we have developed FLAC,
which takes advantage of the rapid photoactivation or conversion of novel
PAFPs (Supplementary Figs 4 and 5). In FLAC, an initially non-fluorescent
sample is locally pulsed with 405-nm light, activating PAFP fluorescence, and
the time-dependent decay of the PAFP intensity reveals its diffusion and binding
kinetics. An additional advantage is that FLAC may present more accurate
binding/diffusion information than FRAP, as adequate photobleaching requires harsh
conditions.

Time-dependent fluorescence images were acquired with a confocal microscope
(TCS SP5, Leica) using a ×10, 0.3 NA objective with images captured every
30-100 ms. PAFPs were activated using a 50-ms, ~5-mW shuttered pulse from an
external, 405-nm laser (Bestofferbuy) coupled into the confocal microscope using
its non-descanned X1 port. A custom-built filter cube was installed with the filter
holder rotated azimuthally by 90° to allow the 405-nm laser light entering the X1
port to be reflected upwards to the objective through a long-pass dichroic filter
(Di01-R405-25x36, Semrock) and illuminate a 2-µm spot in the centre of the
sample x-y plane.

Fluorescence in the photoactivation region was monitored in time and
constitutes the principal data for the experiments. Data were collected before, during and
after photoactivation and analysed in MATLAB. Fluorescence before activation
represents the background signal, and was mainly a mixture of detection noise and
background fluorescence. The background was defined as zero and subtracted
from the data for each measurement. Data collected after activation were
normalized to their maximum value, which was reached ~10 ms after photoactivation.
Thus, after normalization, fluorescence intensity data sets start at 1 at time zero
and approach zero (background) at long times. Although a wide variety of fitting
algorithms and procedures exist for FRAP, which in its analysis is mathematically
similar to FLAC, we use a single exponential decay, $I(t) = a\,e^{-t/k} + c$, to fit our data.
These fits are accurate, allow us to differentiate data quantitatively as simply as
possible with a single time constant, k (the characteristic fluorescence decay time),
and do not invoke a variety of free parameters or models with a priori
assumptions. In each case, the fluorescence intensity decay is quantified and numerically
compared using the same individual sample with and without strain to ensure 
consistency, eliminate artefacts or variability, and isolate mechanical strain as the 
single variable. In the case of single-step uniaxial strain presented in Fig. 2, data
were fitted to $I(t) = a\,e^{-t/k} + 0.5\,e^{-t/(1.5\,\mathrm{s})} + c$ to provide a more accurate fit and
compensate for the rapid diffusion of free fluorophore.
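The preprocessing described above (background subtraction followed by normalization to the post-activation maximum) can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' MATLAB code: the function name, the use of the mean of the pre-activation frames as the background, and the example numbers are assumptions.

```python
def normalize_trace(intensities, n_pre):
    """Background-subtract and normalize a photoactivation trace.

    intensities: raw intensity per frame in the photoactivation region.
    n_pre: number of frames recorded before activation (assumed known).
    Returns the post-activation trace scaled so that its maximum is 1.
    """
    if not 0 < n_pre < len(intensities):
        raise ValueError("n_pre must leave frames both before and after activation")
    pre = intensities[:n_pre]
    # Assumption: the background is taken as the mean pre-activation signal.
    background = sum(pre) / len(pre)
    post = [value - background for value in intensities[n_pre:]]
    peak = max(post)
    if peak <= 0:
        raise ValueError("no signal above background after activation")
    return [value / peak for value in post]

# Invented example: three background frames, then a decaying signal.
example = normalize_trace([10, 10, 10, 110, 60, 35, 10], n_pre=3)
```

The normalized trace can then be passed to whichever exponential-fitting routine is in use.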

Photobleaching decay times were measured by photoactivating PAFP
nonspecifically adsorbed to the glass surface, and were found to be in excess of 800 s,
making them negligible on the timescale of FLAC experiments (<10 s), as shown
in Supplementary Fig. 7. Photoconversion times were determined by acquiring
fluorescence intensity images of PAFP adsorbed to glass every 3 ms during a
50-ms, 405-nm activation flash, as shown in Supplementary Fig. 4.
The instruction of the immune system to be tolerant of self, thereby
preventing autoimmunity, is facilitated by the education of T cells
in a specialized organ, the thymus, in which self-reactive cells are
either eliminated or differentiated into tolerogenic Foxp3+ regulatory
T (Treg) cells. However, it is unknown whether T cells are
also educated to be tolerant of foreign antigens, such as those from
commensal bacteria, to prevent immunopathology such as inflammatory
bowel disease. Here we show that encounter with commensal
microbiota results in the peripheral generation of Treg cells
rather than pathogenic effectors. We observed that colonic Treg
cells used T-cell antigen receptors (TCRs) different from those used
by Treg cells in other locations, implying an important role for local
antigens in shaping the colonic Treg-cell population. Many of the
local antigens seemed to be derived from commensal bacteria, on
the basis of the in vitro reactivity of common colon Treg TCRs.
These TCRs did not facilitate thymic Treg-cell development, implying
that many colonic Treg cells arise instead by means of antigen-driven
peripheral Treg-cell development. Further analysis of two of
these TCRs by the creation of retroviral bone marrow chimaeras
and a TCR transgenic line revealed that microbiota indigenous to
our mouse colony was required for the generation of colonic Treg
cells from otherwise naive T cells. If T cells expressing these TCRs
fail to undergo Treg-cell development and instead become effector
cells, they have the potential to induce colitis, as evidenced by
adoptive transfer studies. These results suggest that the efficient
peripheral generation of antigen-specific populations of Treg cells
in response to an individual's microbiota provides important
post-thymic education of the immune system to foreign antigens,
thereby providing tolerance to commensal microbiota.

Although Treg cells are required for the maintenance of gut
tolerance, commensal bacteria are not necessary for colonic Treg-cell
generation (Supplementary Fig. 1). Moreover, Treg cells from
germ-free mice can protect against colitis. In contrast, extrathymic
generation of Treg cells that respond to foreign antigens has been
demonstrated with TCR transgenic models of oral tolerance.
Peripheral Treg-cell development is also increased in the gut, a
response that is potentially related to the presence of specialized
antigen-presenting cells. Finally, species in the Clostridium genus, and
Bacteroides fragilis by means of a protease-resistant capsular
polysaccharide, can increase the frequency or function of colonic Treg cells, but
may do so through innate immune receptors. Thus, it remains
unclear whether the protective colonic Treg-cell population is generated
against self antigens or foreign antigens derived from the commensal
bacteria found in each individual.

Although TCR transgenic lines that respond to antigens derived
from commensal bacteria have been described, the normal in vivo
frequency of those TCRs in the Treg-cell subset as compared with the
effector T-cell subsets is unknown. To study the TCRs normally found in
the colonic Treg-cell population, we analysed the colonic TCR repertoire.
As a result of the great diversity of the fully polyclonal TCR repertoire, we
and others have used genetically engineered mice with limited polyclonal
repertoires. The analysis of TCR α-chain repertoires of CD4+ T cells
from the colonic lamina propria of mice expressing a fixed transgenic
TCRβ chain revealed that Foxp3+ Treg cells use TCRs that are quite
distinct from those of effector/memory (CD44hi) and naive (CD44lo)
Foxp3− cells (Fig. 1a, b and Supplementary Figs 2 and 3). This is
illustrated by using the Morisita-Horn similarity index (Fig. 1a), in which
values from 0 to 1 represent low to high similarity between two data sets,
and also by the analysis of the relative frequencies of individual common
TCRs in each T-cell subset (Fig. 1b). Moreover, the analysis of TCR
chains of cells pooled from the secondary lymphoid organs and colons of
additional mice showed that Treg TCR usage in the colon differed greatly
from that in the other organs (Fig. 1c, d and Supplementary Fig. 3). Like
the Treg-cell population, the effector/memory T-cell population
expressed TCRs largely unique to the colon; however, these two subsets
showed very little overlap. Thus, consistent with our previous
observations in other peripheral locations, these TCR repertoire data suggest
that the colonic Treg-cell population is strongly shaped by the local
antigenic milieu.
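For readers unfamiliar with the Morisita-Horn index used above, the following minimal Python sketch shows how it can be computed from two tables of clone counts. The function name and the example repertoires are invented for illustration and are not data from the study. The formula is the standard one: twice the summed products of counts, divided by the sum of the two Simpson-type terms multiplied by both totals.

```python
def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity between two repertoires.

    counts_a, counts_b: dicts mapping a clone identifier (for example a CDR3
    sequence) to the number of times it was observed in each data set.
    Returns a value between 0 (no shared clones) and 1 (identical proportions).
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both repertoires must contain at least one sequence")
    shared = set(counts_a) & set(counts_b)
    cross = sum(counts_a[clone] * counts_b[clone] for clone in shared)
    d_a = sum(n * n for n in counts_a.values()) / (total_a ** 2)
    d_b = sum(n * n for n in counts_b.values()) / (total_b ** 2)
    return 2.0 * cross / ((d_a + d_b) * total_a * total_b)

# Hypothetical repertoires with made-up clone labels and counts.
treg_like = {"clone_A": 40, "clone_B": 10, "clone_C": 5}
memory_like = {"clone_C": 2, "clone_D": 30, "clone_E": 25}
similarity = morisita_horn(treg_like, memory_like)
```

Because the index depends on relative frequencies rather than absolute counts, repertoires sampled at different depths can be compared directly, and the abundant clones dominate the result.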

To assess whether the local antigens were bacterial in origin, we
expressed colon Treg TCRs (Supplementary Fig. 3) in a hybridoma cell
line that contains a green fluorescent protein (GFP) reporter for nuclear
factor of activated T cells (NFAT) activation as a readout for TCR
engagement. We initially screened these hybridomas against autoclaved
colonic contents and were surprised to find that many (five of eight)
showed some degree of reactivity to preparations from conventionally
housed, but not germ-free, mice (Fig. 2a, Table 1 and Supplementary
Fig. 4a, b). Colonic contents from Jackson Laboratories-sourced (Jax)
Rag1−/− mice were not recognized by four of these colonic Treg TCRs
unless they were first housed together with mice from our colony. In
contrast to the colonic Treg TCRs, none of the eight abundant colonic
activated/memory (CD44hi) TCRs (Supplementary Fig. 5), or of four
other TCRs tested (including B8 and TRS (ref. 20)) showed any
reactivity (not shown). In addition, TCR recognition did not occur in the
absence of dendritic cells and was blocked by antibodies against MHC
class II (not shown). Thus, these data suggest that the antigens
responsible for TCR activation are derived from microbes that can be
passed between mice that are housed together.

We therefore attempted to identify bacteria recognized by these
colonic Treg TCRs by screening small pools of two or three heat-killed
bacteria isolated in pure culture from the colonic contents of mice in
our colony (Fig. 2b and Supplementary Figs 4c and 6). Two colonic
Treg TCRs (CT6 and CT7) reacted to one or more pools. Testing of
individual isolates from these pools revealed that CT6 reacted with
isolate ACNA18.1, identified by 16S ribosomal RNA gene sequencing
as a previously uncharacterized Clostridiales species. All three isolates
Figure 1 | The colonic Treg TCR repertoire is unique. a, TCR usage between
colonic T-cell subsets. A total of 2,892 TRAV14 TCRα sequences from colonic
naive, memory and Treg cells of five individual mice were compared
(Supplementary Fig. 2). Each symbol represents a Morisita-Horn similarity
comparison between two different T-cell subsets within each mouse (top), or a
comparison of the same T-cell subset between different mice (bottom). Bars
represent means ± s.e.m. b, Analysis of individual TCRs. The relative
distribution within all T-cell subsets is shown for the 20 most common
individual TCRs in each colonic T-cell subset. For example, a TCR with equal
percentage in the Foxp3+ and CD44hi subsets would be shown as a
half-green/half-orange bar. This analysis uses the pooled data set, which includes


recognized by CT7, but none of 34 other sequenced isolates, were
identified as Parabacteroides distasonis (Fig. 2c and Supplementary
Fig. 6). To assess whether CT7 broadly recognizes commensal species
within the Parabacteroides/Bacteroides genera, we screened it against
an additional panel of closely related mouse-derived commensal
Parabacteroides and Bacteroides species. CT7, but not CT2 or CT6,
recognized only a subset of these bacterial species, including a second
isolate of P. distasonis (Fig. 2d). Importantly, isolates that did not
stimulate CT7 were recognized by another TCR hybridoma, DP1
(Supplementary Fig. 7a), indicating that these preparations contained
antigens capable of stimulating TCRs. The almost mutually exclusive
specificity of CT7 and DP1 within the Bacteroidaceae family makes it
unlikely that these TCRs recognize self antigens in the host that are
differentially induced by these closely related bacteria
(Supplementary Fig. 7b). The TCR-specific reactivity patterns further suggest that
TCR activation is not due to non-specific stimulation by generic
immunostimulatory bacterial components or superantigens. Rather,
these TCRs probably recognize distinct bacterial protein antigens,
because predigestion of heat-killed P. distasonis with proteinase K
abrogated recognition by the CT7 hybridoma (Supplementary Fig. 7c), in
contrast with a report of Treg-cell induction by protease-insensitive
capsular polysaccharide from B. fragilis. Although proof of direct
bacterial recognition will require the identification of specific epitopes,
these data strongly argue for the recognition of a bacteria-derived
peptide by colonic Treg TCR CT7.

More than half of the tested colonic Treg TCRs recognized colonic
contents and/or bacterial isolates (Table 1). However, this may
underestimate the true frequency of colonic Treg TCRs that respond to
bacterial antigens. A lack of reactivity in our screen cannot be
interpreted to mean that the TCR does not recognize bacteria, because the
antigens may be rare in unfractionated colonic contents, lost on
sequences from individual mice as well as 9,680 sequences from experiments
1-3, each consisting of cells from three to five mice (Supplementary Fig. 2).
Note that one TCR is found in both Foxp3” (no. 15) and CD44" (no. 3) plots, 
and one in both CD44" (no. 5) and CD44™ (no. 12) plots; all others appear in 
only one plot. c, Anatomical distribution of colonic TCRs. Morisita-Horn 
indices comparing the colon data with those for other locations (filled symbols) 
or between each of the other locations (open symbols) are shown. Mes., 
mesenteric; cerv., cervical. d, Analysis of individual TCRs. The 20 most 
prevalent colon TCRs for each subset in the pooled data set are shown, and their 
presence at other locations is represented in a manner analogous to b. LN, 
lymph node. 


autoclaving, or derived from an organism that was not isolated in
our screen. Thus, the specificity of common colonic Treg TCRs seems
to be skewed towards the recognition of bacterial antigens.

Although Treg cells may develop extrathymically as a result of an
encounter with bacterial antigens, it is also possible that these Treg cells
are selected by self-antigen recognition in the thymus, followed by
expansion in the periphery due to cross-reactivity. To assess the ability
of these colonic Treg TCRs to facilitate thymic Treg-cell selection, we
tracked the development of immature Foxp3®? Rag] ‘~ thymocytes 
that were retrovirally transduced with a colonic Treg TCR. None of the
colonic Treg TCRs generated an appreciable frequency of Foxp3+
thymocytes (Fig. 3a and Supplementary Figs 8 and 9a), in contrast
with Treg TCRs normally found at other peripheral locations (R19, G25
and R111; Supplementary Fig. 3). Note that G25 and R111 can be
found at low frequency in the colonic Treg-cell subset, suggesting that
the colon does contain some thymically derived Treg cells. The lack of
Treg development in cells expressing colonic Treg TCRs cannot be
attributed to an overwhelmed thymic niche (Supplementary Fig.
9b). These data therefore demonstrate that many common colonic Treg
TCRs facilitate thymic Treg-cell selection poorly, if at all, implying that
these TCRs instead mediate peripheral Treg-cell development.

The retroviral transduction of thymocytes does not result in the
emergence of sufficient numbers of transduced T cells from the thymus
to permit their reliable detection in the periphery. We therefore
retrovirally transduced self-renewing bone marrow progenitors and used
them to create stable chimaeras, selecting colonic Treg TCRs CT2 and
CT6 on the basis of their in vitro reactivity to colonic contents (Fig. 2a).
In these chimaeras we observed virtually no development of Foxp3
expression in cells expressing CT2 or CT6 (Fig. 3b, left;
Supplementary Fig. 10). We reasoned that CT2 and CT6 may not recognize
the microbiota in these commercially sourced host mice (Fig. 2a), and
therefore performed experiments in which the chimaeras were housed
together with mice from our colony. This resulted in the induction of
CT2-expressing or CT6-expressing Treg cells preferentially localized in
the colon (Fig. 3b, right, and Supplementary Fig. 11). In an observation
paralleling that previously made in the thymus, there also seems to
be a saturable, antigen-specific Treg-cell niche in the periphery
(Supplementary Fig. 11c).

Although these data strongly suggest that many colonic Treg cells are
generated extrathymically on encountering bacterial antigen, it
remained possible that a rare population of thymically generated Treg
cells below the limit of our detection expanded on encountering
peripheral antigen. We therefore generated CT6 TCR transgenic mice
(Supplementary Fig. 12) and adoptively transferred CD44lo Foxp3−
CT6 T cells mixed with congenic polyclonal CD4+ 'filler' T cells into
T-cell-deficient Tcrb−/− hosts. Consistent with the bone marrow
chimaera data, the transferred CT6 T cells expanded and induced the
expression of Foxp3 only if the recipients were housed together with
mice from our colony (Fig. 3c and Supplementary Fig. 13). Taken
together with the observed lack of Treg-cell development by thymocytes
expressing colonic Treg TCRs (Fig. 3a), these data suggest that a
substantial proportion of the colonic Treg population arises extrathymically
from antigen-specific interactions with the colonic microbiota.

The notion that most colonic Treg cells are generated as a result of
microbial interactions is at odds with the observation that germ-free
mice have normal Treg-cell frequencies (Supplementary Fig. 1).
However, we and others have observed that most colonic Treg cells
in conventionally housed, but not germ-free, mice are probably of
peripheral origin, because these cells express low levels of the
transcription factor Helios (Fig. 3d and Supplementary Fig. 14), a putative
marker for thymically derived Treg cells. We therefore speculate that
germ-free conditions skew the colonic Treg TCR repertoire towards
thymically derived Treg TCRs.

Figure 2 | In vitro reactivity of colonic Treg TCRs to colonic contents and
bacterial isolates. a, Reactivity to colonic contents. Colonic Treg
TCR-expressing NFAT-GFP hybridoma cells were cultured with Flt3L-induced
dendritic cells in the presence of autoclaved food homogenate, or autoclaved
colonic contents (CC) isolated from Rag1−/− mice from Jackson Laboratories
(Jax Rag1−/−), Jax Rag1−/− mice housed together with mice from our colony
(co-housed Rag1−/−), germ-free mice and conventionally housed (conv.) mice
in our colony. b, Reactivity to bacterial pools. Cultures of heat-killed
commensal bacteria isolated from our colony (Supplementary Fig. 6) were
pooled (denoted by culture conditions and a letter) and screened for their
ability to stimulate colonic Treg TCR-expressing hybridomas. For a, b, see
Supplementary Fig. 4 for additional TCRs. c, Reactivity to individual isolates.
Hybridomas showing reactivity against a pool of bacterial isolates were
re-screened against the individual constituents (numbered). Data shown in a-c are
the fold change in percentage GFP+ over the no-antigen control from 2-4
experiments (means ± s.e.m.). d, Specificity of colonic Treg TCRs. A panel of
heat-killed Parabacteroides and Bacteroides species (Supplementary Fig. 7) was
tested against hybridomas expressing CT2, CT6 and CT7. Data shown are
means ± s.e.m. for three experiments.

Table 1 | Summary of in vitro screening of colonic Treg-cell TCRs

Colon Treg-cell TCR                       Reactivity
Name  CDR3 amino-acid residue sequence    Conv. CC  Transferred by co-housing  Bacterial isolate
CT1   AASWASGYNKLT                        Yes       Yes
CT2   AASAIWNTGYQNFY                      Yes       Yes
CT4   AASEYSALGRLH
CT6   AASGYSALGRLH                        Yes       Yes                        Clostridiales sp. ACNA18.1
CT7   AASATGDNRIF                                                              Parabacteroides distasonis
CT8   AASLTGGYKVV
CT9   AASADNRAGNKLT                       Yes       Yes
G57   AASELYQGGRALI                       Yes

Conv. CC, colonic contents from conventionally housed mice.
Figure 3 | Colonic Treg TCRs facilitate thymic Treg-cell development poorly,
if at all. a, Assessment of thymic Treg-cell development from
TCRα-transduced Rag1−/− thymocytes. The gating strategy (left), and summary of
two to four experiments per TCR (right) are shown. See Supplementary Fig. 3
for additional TCR information, and Supplementary Figs 8 and 9 for plots and
analysis of clonal frequencies. Comparison of colon versus other Treg TCRs
revealed P-values < 0.01. b, Mixed retroviral bone-marrow chimaeras. The
percentage of Foxp3+ cells in the CT2-expressing or CT6-expressing
CD45.2+ CD4+ population is shown in hosts housed together (right) or not
(left) with mice from our colony. See Supplementary Figs 10 and 11 for
additional analyses. Periph., peripheral. c, Peripheral conversion of CT6 TCR


The efficient differentiation of naive T cells into T,.,, rather than 
effector, cells may be important for generating colonic tolerance, 
because it has been observed that TCRs that facilitate thymic T,eg-cell 
development can be pathogenic when expressed on effector T cells'”””. 
To address this possibility, we performed an initial analysis of colonic 
TCR repertoires in mice expressing the fixed TCRβ chain and undergoing
spontaneous colitis due to genetic deficiencies in interleukin
(IL)-2, IL-10 or transforming growth factor-β receptor signalling
(Supplementary Fig. 2a). We observed that several colonic TCRs
almost exclusively found in the Foxp3+ data sets in normal mice were
found in the effector/memory data sets in the diseased animals
(Supplementary Fig. 15a). Although these genetic manipulations may affect
Treg-cell development or survival, the relatively high abundance of
some of these TCRs in the CD44hi subset suggests that effector cells
expressing these TCRs are expanding in the colitic environment. To
test for the pathogenic potential of colonic Treg TCRs, we retrovirally
expressed CT2 and CT6 TCRs on peripheral, monospecific TCRαβ
transgenic cells with known specificity for a foreign antigen (human
CLIP peptide). Adoptive transfer of these cells, which were virtually
all Foxp3−, into Rag1−/− hosts housed with mice from our colony
induced weight loss and colitis (Fig. 4 and Supplementary Fig. 15b, c). 
In contrast, cells expressing only the transgenic TCR or a TCR from 
the naive T-cell subset (B8) did not. The failure of these retrovirally 
transduced T cells to upregulate Foxp3 and become regulatory in this 
situation is probably due to expansion in a lymphopenic environment,
as well as in vitro T-cell activation, a requirement for retroviral
transgenic cells. Naive CD45.2 CT6 cells and CD45.1 CD4+ filler cells were
adoptively transferred into Tcrb−/− hosts for 5 weeks. The percentage (left) of
Foxp3+ CT6 cells (Vα2+Vβ6+CD45.2+CD45.1−CD4+) is shown. The
number of CT6 cells (right) was determined by flow cytometry of the entire
colonic lamina propria. Data are from three experiments; bars represent
means ± s.e.m. See Supplementary Fig. 13 for flow cytometric plots. d, Helios
expression in Treg cells. Representative intracellular Helios staining in CD4+
Foxp3+ cells from conventionally housed (conv.) and germ-free (GF) Foxp3gfp
mice is shown, and is summarized in Supplementary Fig. 14. For all plots in this
figure, each symbol represents data from an individual host.


transduction. Thus, these data illustrate the potential pathological con- 
sequences of T-cell recognition of commensal bacterial antigens under 
conditions that disfavour Treg development.

This analysis of common colonic Treg TCRs in a fixed TCRβ
repertoire suggests a model (Supplementary Fig. 16) in which T cells 
expressing these TCRs exist as naive T cells in the absence of antigen 
Figure 4 | Pathogenic potential of colonic Treg TCRs. a, Adoptive transfer of
peripheral T cells transduced with CT2 or CT6 into Rag1−/− hosts housed with
mice from our colony. Non-transduced T cells (none) or naive TCR transduced
T cells (B8) were used as controls. Each line represents an individual recipient.
One representative experiment is shown (summary in Supplementary
Fig. 15b). b, A representative haematoxylin/eosin-stained section of the
descending colon 7-10 weeks after T-cell transfer. Original magnification ×4.
(Fig. 3b, left, and Supplementary Figs 10 and 11). Encounter with
bacteria-derived foreign antigens in the colon seems to drive the
generation of Foxp3+ Treg cells efficiently (Fig. 3b), because it typically
does not result in the substantial simultaneous development of CD44hi
cells of the same specificity (Fig. 1a, b). This diversion of naive T cells
with bacterial TCR specificity into the Treg-cell lineage may be crucial
for preventing the generation of colitogenic effector cells (Fig. 4). Thus, 
these data support a model in which an individual’s T-cell population 
is not only instructed by classic self/non-self discrimination mechan- 
isms during thymic development but is also educated in the periphery 
to accommodate the variety of non-self antigens derived from the 
commensal microbiota at mucosal sites. 


METHODS SUMMARY 

Mice. TCli TCRβ Foxp3gfp Tcra+/−; Foxp3IRES-GFP; Il-2−/−; Il-10−/−; and
dnTGFβRII strains have been described (see Methods). C57BL/6 Rag1−/− and
CD45.1 mice were obtained from Jackson Laboratories and the National Cancer 
Institute, respectively. Germ-free mice were generated in collaboration with J. 
Gordon. CT6 transgenic mice were generated as described”’. 

TCR repertoire analysis. Analyses of TRAV14 (Vα2) TCR sequences from TCliβ
transgenic mice were performed as described”®. Lamina propria cell suspensions
from the entire colon were prepared as described’, and CD4+ subsets were sorted
with a FACSAria (Becton Dickinson).

Hybridoma assays. Hybridoma cells expressing GFP under an NFAT promoter” 
were retrovirally transduced with TCli TCRβ-IRES-mCD4 and an individual TCRα
chain. Hybridomas were cultured with flt3-ligand-elicited dendritic cells with
the indicated antigen preparations and analysed by flow cytometry after 1.5 days. 
Antigen preparations. Whole colonic contents and food pellets were diluted with 
phosphate-buffered saline, vortex-mixed, homogenized, filtered, and autoclaved 
for 15 min. Colonic bacterial isolation was performed as described” (see Sup- 
plementary Fig. 6). 

In vivo developmental assays. Retroviral transduction and intrathymic transfer 
of Rag1−/− thymocytes were performed as described in Methods. Analysis of CD4
SP thymocytes was performed about 2.5 weeks later. Retroviral bone marrow 
chimaeras were created as described”’. Some recipients were housed together with 
mice from our colony 2 weeks after bone marrow reconstitution, for a period of 
1 week. 

In vivo peripheral T-cell assays. Retroviral transduction of peripheral TCli-αβ
Foxp3gfp Rag1−/− T cells was performed as described”’, and cells were transferred
intravenously into co-housed Rag1−/− hosts. Sorted CD4+CD44loFoxp3−
cells (5 × 10°) from CT6 transgenic mice were co-transferred with 5 × 10°
CD45.1+CD4+ ‘filler’ cells into Tcrb−/− mice. Recovered cells were analysed by
flow cytometry 5 weeks later. 

Statistics. The Wilcoxon rank sum test was used unless otherwise indicated. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 29 November 2010; accepted 5 August 2011. 
Published online 21 September 2011. 


1. Josefowicz, S. Z. & Rudensky, A. Control of regulatory T cell lineage commitment
and maintenance. Immunity 30, 616-625 (2009).

2. Belkaid, Y. & Tarbell, K. Regulatory T cells in the control of host-microorganism
interactions. Annu. Rev. Immunol. 27, 551-589 (2009).

3. Barnes, M. J. & Powrie, F. Regulatory T cells reinforce intestinal homeostasis.
Immunity 31, 401-411 (2009).

4. Backhed, F., Ley, R. E., Sonnenburg, J. L., Peterson, D. A. & Gordon, J. I. Host-bacterial
mutualism in the human intestine. Science 307, 1915-1920 (2005).

5. Min, B. et al. Gut flora antigens are not important in the maintenance of regulatory
T cell heterogeneity and homeostasis. Eur. J. Immunol. 37, 1916-1923 (2007).

6. Round, J. L. & Mazmanian, S. K. Inducible Foxp3+ regulatory T-cell development
by a commensal bacterium of the intestinal microbiota. Proc. Natl Acad. Sci. USA
107, 12204-12209 (2010).




7. Singh, B. et al. Control of intestinal inflammation by regulatory T cells. Immunol.
Rev. 182, 190-200 (2001).

8. Curotto de Lafaille, M. A. et al. Adaptive Foxp3+ regulatory T cell-dependent and
-independent control of allergic inflammation. Immunity 29, 114-126 (2008).

9. Sun, C. M. et al. Small intestine lamina propria dendritic cells promote de novo 
generation of Foxp3 Treg cells via retinoic acid. J. Exp. Med. 204, 1775-1785 
(2007). 

10. Zheng, Y. et al. Role of conserved non-coding DNA elements in the Foxp3 gene in 
regulatory T-cell fate. Nature 463, 808-812 (2010). 

11. Coombes, J. L. et al. A functionally specialized population of mucosal CD103+ DCs
induces Foxp3+ regulatory T cells via a TGF-β and retinoic acid-dependent
mechanism. J. Exp. Med. 204, 1757-1764 (2007).

12. Mucida, D. et al. Reciprocal TH17 and regulatory T cell differentiation mediated by
retinoic acid. Science 317, 256-260 (2007). 

13. Benson, M. J., Pino-Lagos, K., Rosemblatt, M. & Noelle, R. J. All-trans retinoic acid 
mediates enhanced Treg cell growth, differentiation, and gut homing in the face of 
high levels of co-stimulation. J. Exp. Med. 204, 1765-1774 (2007). 

14. Atarashi, K. et al. Induction of colonic regulatory T cells by indigenous Clostridium
species. Science 331, 337-341 (2011). 

15. Round, J. L. et al. The Toll-like receptor 2 pathway establishes colonization by a 
commensal of the human microbiota. Science 332, 974-977 (2011). 

16. Cong, Y., Feng, T., Fujihashi, K., Schoeb, T. R. & Elson, C. O. A dominant, coordinated
T regulatory cell-IgA response to the intestinal microbiota. Proc. Natl Acad. Sci. USA
106, 19256-19261 (2009).

17. Hsieh, C.-S. et al. Recognition of the peripheral self by naturally arising
CD25+ CD4+ T cell receptors. Immunity 21, 267-277 (2004).

18. Pacholczyk, R., Ignatowicz, H., Kraj, P. & Ignatowicz, L. Origin and T cell receptor
diversity of Foxp3+CD4+CD25+ T cells. Immunity 25, 249-259 (2006).

19. Wong, J. et al. Adaptation of TCR repertoires to self-peptides in regulatory and
nonregulatory CD4+ T cells. J. Immunol. 178, 7032-7041 (2007).

20. Lathrop, S.K., Santacruz, N. A., Pham, D., Luo, J. & Hsieh, C. S. Antigen-specific 
peripheral shaping of the natural regulatory T cell population. J. Exp. Med. 205, 
3105-3117 (2008). 

21. Ise, W. et al. CTLA-4 suppresses the pathogenicity of self antigen-specific T cells by
cell-intrinsic and cell-extrinsic mechanisms. Nature Immunol. 11, 129-135
(2010).

22. Bloom, S. M. et al. Commensal Bacteroides species induce colitis in host-genotype-
specific fashion in a mouse model of inflammatory bowel disease. Cell Host
Microbe 9, 390-403 (2011). 

23. Bautista, J. L. et al. Intraclonal competition limits the fate determination of 
regulatory T cells in the thymus. Nature Immunol. 10, 610-617 (2009). 

24. Leung, M. W., Shen, S. & Lafaille, J. J. TCR-dependent differentiation of thymic 
Foxp3+ cells is limited to small clonal sizes. J. Exp. Med. 206, 2121-2130 (2009).

25. Nishio, J., Feuerer, M., Wong, J., Mathis, D. & Benoist, C. Anti-CD3 therapy permits 
regulatory T cells to surmount T cell receptor-specified peripheral niche 
constraints. J. Exp. Med. 207, 1879-1889 (2010). 

26. Thornton, A. M. et al. Expression of Helios, an Ikaros transcription factor family 
member, differentiates thymic-derived from peripherally induced Foxp3+ T
regulatory cells. J. Immunol. 184, 3433-3441 (2010). 

27. Hsieh, C. S., Zheng, Y., Liang, Y., Fontenot, J. D. & Rudensky, A. Y. An intersection 
between the self-reactive regulatory and nonregulatory T cell receptor repertoires. 
Nature Immunol. 7, 401-410 (2006). 


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank K. Murphy, T. Egawa, Y. Zheng, J. Scott-Browne,
J. Fontenot and S. Wetzel for discussion and reading of the manuscript; A. Kau and
J. Gordon for discussions and generation of germ-free animals; N. P. Malvin for 
assistance with bacteriology; and J. Hunn for technical assistance. C.S.H. and 
co-workers are funded by the National Institute of Allergy and Infectious Diseases and 
the Burroughs-Wellcome Fund. S.M.B. was supported by National Institutes of Health 
training grant 5T32AI0071632. 


Author Contributions S.K.L., S.R., K.N. and N.S. performed most of the experiments. 
S.M.B. designed and performed the bacteriology. C.W.L. developed and assisted with 
the intrathymic transfer experiments. D.P. and T.S. were involved in study design. S.K.L. 
and C.S.H. designed the experiments and wrote the manuscript. All authors discussed 
the results and commented on the manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to C.S.H. (chsieh@wustl.edu). 




©2011 Macmillan Publishers Limited. All rights reserved 




METHODS 

Mice. TCli TCRβ Foxp3gfp Tcra+/− (ref. 27), Foxp3IRES-GFP (ref. 28), Il-2−/− (ref. 29),
Il-10−/− (ref. 30) and dnTGFβRII (ref. 31) strains on C57BL/6 background have been
described. C57BL/6 Rag1−/− and C57BL/6 CD45.1 congenic mice were obtained
from Jackson Laboratories and National Cancer Institute, respectively. Germ-free 
mice were generated in collaboration with J. Gordon. CT6 transgenic mice were 
generated as described”, with microinjection into B6 x 129 fertilized eggs. In experi- 
ments in which animals obtained from commercial vendors were housed together, a 
female bred in our facility was added to each cage of mice for 7 days, and then 
removed before the experiment. In retroviral bone marrow chimaeras, the animal 
was added 2 weeks after bone marrow transfer, for 1 week, then removed for the 
duration of the experiment. Animal experiments were performed in a specific patho- 
gen-free facility in accordance with the guidelines of the Institutional Animal Care 
and Use Committee at Washington University. 

TCR repertoire analysis. Analyses of TRAV14 (Vα2) TCR sequences from TCliβ
transgenic mice were performed as described”. In brief, lamina propria cell
suspensions were prepared from the entire colon (including the caecum) as
described’, without the final Percoll enrichment, and sorted into CD4+ T-cell
subsets (CD44hiFoxp3−, CD44loFoxp3− and Foxp3+) with a FACSAria (Becton
Dickinson). RNA was isolated, TCRα cDNA was isolated, and a TRAV14 (Vα2)
cDNA library was generated by PCR and sequenced by the Genome Sequencing
Center at Washington University. Comparison of TCR repertoires was performed
with the Morisita–Horn statistical test**, which compares two populations, taking
into account the overlap and relative abundance of the species in the two
populations, and expresses their similarity on a scale from 0 (no similarity) to 1.0
(exactly the same).
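
As a minimal sketch of how such a similarity index can be computed, the Python function below implements the standard Morisita–Horn formula on two tables of clonotype counts. The function name, the dictionary input format and the toy counts are illustrative assumptions (the CDR3 strings are borrowed from Table 1 only as labels); this is not the authors' analysis code.

```python
def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity of two count tables.

    counts_a, counts_b: dicts mapping a clonotype label (e.g. a CDR3
    sequence) to the number of times it was observed in each data set.
    Returns 0.0 for no shared clonotypes and 1.0 for identical proportions.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both data sets must contain at least one sequence")

    # Sum of products over every clonotype seen in either data set.
    labels = set(counts_a) | set(counts_b)
    cross = sum(counts_a.get(k, 0) * counts_b.get(k, 0) for k in labels)

    # Simpson-type dominance terms for each data set.
    dominance_a = sum(n * n for n in counts_a.values()) / total_a ** 2
    dominance_b = sum(n * n for n in counts_b.values()) / total_b ** 2

    return 2 * cross / ((dominance_a + dominance_b) * total_a * total_b)


# Hypothetical counts, for illustration only (not data from the paper).
sample_1 = {"AASGYSALGRLH": 30, "AASAIWNTGYQNFY": 10}
sample_2 = {"AASGYSALGRLH": 3, "AASAIWNTGYQNFY": 1}
similarity = morisita_horn(sample_1, sample_2)
```

Because the index depends on relative rather than absolute abundance, two data sets of different sequencing depth but equal clonotype proportions score as identical, which is the property the description above relies on.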

Hybridoma assays. T-cell hybridoma cells, which do not express a T-cell receptor 
and express GFP under an NFAT promoter”, were retrovirally transduced with 
TCli TCRβ-IRES-mCD4 and sorted for CD4 expression. TCRα chains of interest,
selected on the basis of average frequency and presence in at least two independent
data sets, were then retrovirally introduced, and Vα2-expressing cells were sorted.
The hybridomas differ only in their retrovirally encoded TCRα chain, which does
not confer any functional properties of the cell from which the TCR originated,
thereby allowing a direct comparison of antigen recognition between TCRs from
different T-cell subsets. These hybridoma cells ((2.5-5.0) × 10*) were cultured
with flt3-ligand-elicited dendritic cells (5 × 10°) and the indicated antigen
preparations, diluted 1:200, in flat-bottomed 96-well plates. The Vα2+CD4+ cells
were analysed for GFP expression after 1.5 days by flow cytometry on a
FACSCanto or FACSAria (Becton Dickinson). Controls without antigen were
performed in parallel and used to normalize data against variations in background 
fluorescence between experiments. 

Antigen preparations. Whole colonic contents were scraped from longitudinally 
opened colon and caecum; they were diluted with phosphate-buffered saline, then 
vortex-mixed, homogenized, filtered through a 70-µm mesh, and autoclaved for
15 min. Food antigen was also diluted with phosphate-buffered saline, homogenized,
filtered and autoclaved. Preparations were stored short-term at −20 °C and long-term
at −80 °C. For a description of the colonic bacterial isolation see Supplementary
Fig. 6 and ref. 22. Isolates were named according to their culture conditions followed 
by a number; pools of two or three isolates were designated by a letter. 

Thymic Treg-cell developmental assay. Retroviral transduction of Foxp3gfp Rag1−/−
thymocytes was performed as described*’ by using a TCRα-P2A-TCRβ vector™.
Thymocytes were injected into the thymus of sublethally irradiated (600 rad)
CD45.1 congenic recipients; 14-16 days later, CD45.2+ CD4+CD8− thymocytes
were analysed by flow cytometry for expression of Foxp3gfp with the use of a
FACSAria. About half of each thymus (8 × 10’) was analysed by flow cytometry.


A total of 10° events were initially collected to determine the frequency of CD45.1
and CD45.2 cells. Subsequently, the storage gate was changed to include only CD45.2
cells for the rest of the sample at high speed (about 25,000-30,000 events s−1). Of the
45 recipients of cells expressing colonic Treg TCRs, 4 showed a handful of
Foxp3+ events (1/5 G57, 1/5 CT9, 2/5 CT2). We believe that many or all of these may
be artefacts, because the Foxp3+ events shown in Supplementary Fig. 8b have a
larger than usual Forward Scatter (FSC) and Side Scatter (SSC), and no such events
were found in ten CT2 retroviral bone marrow chimaeras (Supplementary Fig. 10, top).
However, we cannot be certain that these events are artefacts, because they are few 
and not reproducibly observed in all recipients. The sensitivity for picking up 
Foxp3* cells may be estimated from the binomial distribution, in which a population 
frequency of 0.3% would result in at least 1 event per 1,000 with 95% confidence, and 
can be calculated from the number of CD4SP events processed (Supplementary 
Fig. 9b, bottom). However, increasing the number of CD4SP cells expressing a 
particular TCR may not increase the sensitivity of the assay, because it seems that 
Treg TCRs often show an inverse correlation between clonal frequency and Treg-cell
development”*™*. 
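
The binomial sensitivity estimate quoted above can be checked with a few lines of Python. This is a sketch under the assumption that each collected event is an independent draw with a fixed probability of being Foxp3+; the function names are hypothetical and are not taken from the paper.

```python
import math


def detection_probability(frequency, n_events):
    """Chance of observing at least one positive event among n_events,
    assuming independent events that are each positive with `frequency`."""
    return 1.0 - (1.0 - frequency) ** n_events


def events_needed(frequency, confidence=0.95):
    """Smallest number of events that gives at least `confidence`
    probability of observing one or more positive events."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))


# The case stated in the text: a 0.3% population among 1,000 events.
p_detect = detection_probability(0.003, 1000)
n_min = events_needed(0.003, 0.95)
```

With a frequency of 0.3%, the detection probability for 1,000 events comes out just above 0.95, consistent with the statement in the text, and the same functions give the detectable frequency for any other number of CD4SP events processed.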

Retroviral bone marrow chimaeras. Foxp3IRES-GFP Rag1−/− (CD45.2) bone
marrow was retrovirally transduced with the CT2 or CT6 TCR using a TCRα-P2A-TCRβ
vector* and injected intravenously with wild-type CD45.1 bone
marrow into lethally irradiated (1,050 rad) CD45.1 hosts from NCI to create 
chimaeras as described’’. Some recipients were housed together (co-housed) with 
mice from our colony for a period of 1 week, beginning 2 weeks after bone marrow 
reconstitution. Cells isolated from the thymus, spleen, mesenteric lymph node and 
colon were analysed by flow cytometry after 6-8 weeks. 

In vivo TCR-induced pathology. Retroviral transduction of in vitro activated 
peripheral TCli-αβ Foxp3gfp Rag1−/− T cells was performed as described’’. The
entire T-cell population was transferred, and adjusted such that 6 × 10° transduced
cells (14% mean transduction efficiency, range 5.8-22%) were intravenously
transferred into each co-housed Rag1−/− mouse. After 7-10 weeks,
tissue from the caecum and colon underwent fixation and staining with haema- 
toxylin and eosin. 

CT6 TCR transgenic cell transfer experiment. Naive (CD4+CD44loFoxp3−) T
cells were sorted from lymph node and spleen from CT6 Foxp3IRES-GFP
Rag1−/− mice. Cells (5 × 10*) were transferred simultaneously with 5 × 10°
CD45.1+CD4+ congenic ‘filler’ cells into Tcrb−/− mice, with the notion that
providing filler cells, which include Treg cells, may limit lymphopenic expansion
and facilitate conversion. Vβ6+Vα2+CD45.2+CD45.1−CD4+ CT6 Tg T cells in
the colonic lamina propria were assessed for Foxp3 expression 5 weeks later. 
Statistics. The Wilcoxon rank sum test was used unless otherwise indicated. 
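
For readers unfamiliar with the rank sum test, the sketch below computes the Wilcoxon rank sum statistic and an exact two-sided P value by enumerating every way of assigning the pooled ranks to the first group. The function names and the example percentages are hypothetical, and the paper does not state which software or which variant (exact or normal approximation) was used, so this is only one possible illustration. Full enumeration is practical only for small groups.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_exact(group_a, group_b):
    """Return (W, p): the rank sum of group_a and the exact two-sided P value."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    observed = sum(ranks[:n_a])
    expected = n_a * (len(ranks) + 1) / 2  # mean rank sum under the null
    deviation = abs(observed - expected)
    extreme = total = 0
    for subset in combinations(ranks, n_a):
        total += 1
        # Count assignments at least as far from the mean as the observed one.
        if abs(sum(subset) - expected) >= deviation - 1e-9:
            extreme += 1
    return observed, extreme / total


# Hypothetical percentages of Foxp3+ cells in two groups of hosts.
co_housed = [42.0, 55.0, 61.0, 48.0]
not_co_housed = [3.0, 5.0, 2.0, 8.0]
w_stat, p_value = rank_sum_exact(co_housed, not_co_housed)
```

Because the test uses only ranks, it makes no assumption that the measured percentages are normally distributed, which suits small groups of individual hosts.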

The vomeronasal organ (VNO) has a key role in mediating the social
and defensive responses of many terrestrial vertebrates to species- 
and sex-specific chemosignals’. More than 250 putative pheromone 
receptors have been identified in the mouse VNO””, but the nature 
of the signals detected by individual VNO receptors has not yet been 
elucidated. To gain insight into the molecular logic of VNO detec- 
tion leading to mating, aggression or defensive responses, we sought 
to uncover the response profiles of individual vomeronasal recep- 
tors to a wide range of animal cues. Here we describe the repertoire 
of behaviourally and physiologically relevant stimuli detected by a 
large number of individual vomeronasal receptors in mice, and 
define a global map of vomeronasal signal detection. We demon- 
strate that the two classes (V1R and V2R) of vomeronasal receptors 
use fundamentally different strategies to encode chemosensory 
information, and that distinct receptor subfamilies have evolved 
towards the specific recognition of certain animal groups or chemical 
structures. The association of large subsets of vomeronasal receptors 
with cognate, ethologically and physiologically relevant stimuli 
establishes the molecular foundation of vomeronasal information 
coding, and opens new avenues for further investigating the neural 
mechanisms underlying behaviour specificity. 

The discovery of large receptor families mediating olfactory and 
vomeronasal chemosensation has offered a unique opportunity to 
decode the molecular logic by which environmental information 
influences animal behaviour**. The VNO of rodents has a critical role 
in identifying sex- and species-specific chemical cues and in mediating 
mating, territorial aggression, defensive responses to predators and 
associated endocrine changes’. With rare exceptions® *, the molecular 
identity of VNO receptors (VRs) recognizing distinct animal cues is 
unknown, thus limiting the ability to explore the sensory mecha- 
nisms underlying behavioural specificity. Prior studies suggested that 
vomeronasal detection is extremely sensitive and narrowly tuned to 
male, female or heterospecific cues*®"’, but they have not allowed the 
identification of the activated receptors. We describe here a robust and 
high-throughput molecular readout of vomeronasal activation that 
enabled us to uncover the receptor specificity of 88 individual VRs 
to a vast range of ethologically relevant cues. These results establish 
the molecular and functional framework underlying vomeronasal 
signalling. 

In initial experiments, we exposed female mice to clean bedding and 
to bedding used by male mice, and assessed the upregulation of the 
immediate early genes (IEGs) Arc, c-Fos, c-Jun, Egr1, FosB and Nr4a1 by
in situ hybridization on VNO tissue. Our data show that the sensitivity
of Egr1 induction following exposure to chemosignals far exceeds that
of other IEGs (Fig. 1a, b) (60.1 ± 7.1 cells per 0.2 mm² for Egr1,
7.9 ± 1.9 cells per 0.2 mm² for c-Fos). Indeed c-Fos, an IEG used in
previous VNO stimulation studies, labels only a subset of Egr1-positive
cells (Supplementary Fig. 1). In TrpC2−/− mutants, in which VNO
activation is genetically impaired’’, Egr1 induction after semiochemical 
exposure is completely abolished (n = 3), demonstrating the specificity 
of Egr1 activation following sensory stimulation (Fig. 1c). 


We then exposed animals to 29 distinct ethologically relevant 
cues”’*. Male and female bedding from different mouse subspecies
and wild-derived strains, as well as a variety of heterospecific cues 
from sympatric competitors and predators, robustly induced Egr1 
expression in the VNO (Fig. 2a). Remarkably, food-related insect 
stimuli and cues from presumably neutral species such as woodchuck 
failed to generate VNO activation. 

V1R and V2R neurons were equally activated by a large variety of 
stimuli, as judged by co-labelling of Egr1 with Gαi2, a marker of V1R-
expressing neurons'*’° (Fig. 2b, Supplementary Fig. 2a). Interestingly, 
simultaneous exposure to multiple cues from the same class of animals 
(for example, Peromyscus species, reptiles, or predatory birds) did not 
significantly increase the number of Egr1-positive cells when com- 
pared to activation by a single stimulus (P > 0.4, two-tailed t-test when 
the strongest of each stimulus class was compared to the correspond- 
ing mix), indicating that neuronal populations activated by related 
animals are largely overlapping (Fig. 2a). In contrast, simultaneous 
exposure to all heterospecific stimuli significantly increased Egr1- 
positive cells from 5% to 10% per cue to up to ~30% (P<0.01, 
two-tailed t-test), indicating that distinct heterospecific cues have different
response profiles. Moreover, whereas mouse bedding activated
5% to 7% of VNO neurons in animals of the opposite sex, mixes of 
conspecific and heterospecific scents activated ~35% of neurons 
(Fig. 2a), suggesting that receptors activated by both types of cues 
are also largely distinct. 
Figure 1 | Egr1 expression is robustly induced by pheromone-evoked VNO
neuronal activation. Female CD-1 mice were exposed to clean or male mouse
bedding and their VNOs analysed for expression of various immediate early
genes (IEGs). a, In situ hybridization with RNA probes to Arc, c-Fos, c-Jun,
Egr1, FosB and Nr4a1. b, Numbers of IEG-positive cells after bedding exposure
(10 sections per VNO, n = 3 animals). Error bars, s.e.m. c, TrpC2, a cation
channel involved in VNO signal transduction, is required for Egr1 induction.
Female TrpC2+/− or TrpC2−/− mice were exposed to male conspecific bedding
and Egr1 expression was visualized in the VNO. Scale bar, 100 µm.
To assess Egr1 as a readout of VNO activation, we compared it to
cue-evoked neuronal responses visualized by the genetically encoded 
calcium indicator, G-CaMP3 (ref. 16). Strikingly, Egr1 and G-CaMP3 
reported remarkably similar patterns of activities in the basal, or basal- 
plus-apical VNO neuroepithelium following exposure to rat and snake 
stimuli, respectively (Fig. 2c-e), confirming Egr1 induction as an 
exquisitely sensitive and accurate marker of VNO neuronal activation. 
Figure 2 | Widespread activation of VNO receptors by conspecific and
heterospecific cues. a, Survey of ethologically relevant vomeronasal stimuli.
Vomeronasal neural activation on exposure to conspecific and heterospecific
cues was visualized by Egr1 induction and quantified. Detection of female cues
by males is designated as ♀(♂). Unless specified, female mice were used. Mixed
heterospecific cues activated Egr1 in significantly more cells than individual
stimuli (P < 0.01, two-tailed t-test). Co-exposure to heterospecific and
conspecific stimuli (all mix, n = 6) resulted in significantly more Egr1-positive
cells (P < 0.05, two-tailed t-test). b, Widespread activation of VNO neurons by
conspecific and heterospecific cues. Shown are in situ hybridization results with
probes against Gαi2 (red) and Egr1 (green). c, Comparison between Egr1 and
G-CaMP3-evoked signal in response to rat or milk snake chemosignals.
G-CaMP3 images are 10-s averages of ΔF frames within stimulus period.
d, Differential response profiles of neurons to rat or snake signals. Stimuli were
perfused from 20 s to 60 s. e, Quantitative comparison between Egr1 and
G-CaMP3-evoked signals. The percentage of activated cells identified by
G-CaMP3 (n = 356 cells for rat stimuli, n = 566 cells for snake stimuli, 9 VNO
slices from 3 animals) among those responsive to 40 mM KCl was plotted in the
graph. Data for Egr1 were taken from a. The difference between Egr1 and
G-CaMP3 was not statistically significant (P > 0.1, two-tailed t-test). f, Clade-level
maps of V1R (left) and V2R (right) activation show distinct clade
specificity for male, female or heterospecific cues. Hatched patterns indicate
response to multiple types of cues. Error bars, s.e.m. Scale bars, 100 µm.


Next, we developed a high-throughput platform to uncover the 
receptors activated by specific cues. With the exception of widely 
expressed V2R2 receptors’’, vomeronasal neurons are thought to 
express a unique receptor gene from the V1Rs or V2Rs. We generated 
209 RNA probes that specifically identify individual or subgroups of 
VRs by in situ hybridization. A collection of clade-specific probes was 
designed to target all receptor sequences within each of the eight 
distinct V1R or V2R clades (Fig. 2f). Probes with higher specificity
that readily distinguish a single or few closely related VR sequences 
were designed on the basis of divergent 5’-UTR/intron’’ and 3’-UTR 
regions in VR genes. The specificity of these probes for closely related 
VRs was validated by dual colour in situ hybridization (Supplementary 
Fig. 3). Although detecting all VRs at single gene resolution was technically
impossible, our probes targeted 139 distinct VRs with a specificity
of a single gene (or at most a few genes).

We then used a hierarchical approach to systematically uncover 
VRs activated by distinct cues (Supplementary Figs 2b, 4). First, the 
co-expression of Egr1 with either Gαi2, Gαo or formyl peptide receptors
(FPRs)'°”° identified the nature of the activated neurons as expressing 
a V1R, V2R or FPR, respectively. Most stimuli activated both V1R- and 
V2R-expressing neurons, while a few activated only V1R- (hawk and 
owls) or V2R-expressing cells (rat, fox and male mouse cues in 
females) (Supplementary Table 1). We found no activation of FPR- 
expressing cells. We then assessed the specific V1R or V2R clades
associated with the activated neurons (Fig. 2f, Supplementary Fig. 2c). 
Interestingly, some clades appeared specifically stimulated by a distinct 
class of cues, for example, V1Rd and V2R clades 4 and 7 by sex-specific 
cues. Subsequently, receptor specific probes were used to unmask the 
exact molecular identity of the Egr1-positive cells. By collecting data 
from 9,948 VNO slices, each containing approximately 1,000 neurons, 
we succeeded in the identification of 88 receptors (56 V1Rs and 32 V2Rs, 
78 single and 10 unresolved receptors) associated with distinct cues 
(Supplementary Fig. 5, Supplementary Table 1, 2). Importantly, these 
receptors span most V1R and V2R clades, thus generating the most 
comprehensive functional map of vomeronasal receptors to date. 

The vomeronasal system plays an essential part in regulating sex- 
specific behaviours. Previous reports suggest that vomeronasal neu- 
rons detect sex-specific cues in mouse urine, tear and saliva?'!®"%?!?, 
and Vmn2r116 (or V2Rp5) was identified as detecting the male pheromone
ESP1 (ref. 6; Supplementary Fig. 6). Our strategy uncovered 28
receptors (25 single, 3 unresolved) detecting mouse cues, among which
26 detect sex-specific cues (Fig. 3a–c, Supplementary Table 1). Only
two receptors (V1ri9, V1ri10) responded to both male and female
mouse cues, consistent with the desensitization of IEG induction in 
Figure 3 | Receptor responses to sex-specific cues. a, b, Male and female
mouse cues are each detected by a specific subset of V1Rs and V2Rs. a, Heat
maps representing the co-localization between Egr1 and representative
vomeronasal receptor genes (yellow, 100% overlap; dark blue, 0% overlap). b, In
situ hybridization of Egr1 (green) and individual receptors (red), with arrows
marking co-localization of Egr1 and receptor signals. Scale bars, 100 µm.
c, Clade organization of V2Rs detecting male (blue) or female (red) cues.
d, Receptors detecting male (blue), female (red) and heterospecific (green) cues
are largely distinct. e, V1Rs and V2Rs display distinct specificity. Shown are the
numbers of receptors that detect unique types of cues (specific) versus multiple
types (promiscuous) among the following categories: male, female, mammalian
non-predator, mammalian predator, reptile, and avian predator.


vivo by self-secreted stimuli. Four receptors (V1re2, V1re3, V1re6, 
V1rg6) were selectively activated by female cues in males and females, 
while a larger set of V1Rs and V2Rs responded to female cues only in 
males (Fig. 3a–c, Supplementary Table 1). In addition, responses to 
male-specific signals involved Vmn2r116, Vmn2r28, Vmn2r15, 
Vmn2r16 and Vmn2r17 in males and females, Vmn2r66 and 
Vmn2r82 in females, and Vmn2r84/85/86/87 and Vmn2r88 in males 
(Fig. 3a–c, Supplementary Table 1). Interestingly, no V1R was found to 
specifically respond to male cues. Thus, consistent with a previous 
report, the detection of sex-specific cues appears to rely on a small 
and specific subset of VNO neurons, the identity of which is now 
clearly established. This molecular logic is likely to underlie the 
initiation of sex-dependent behavioural interactions, such as male-male 
aggression and mating behaviours. 

Vomeronasal detection of heterospecific cues, or kairomones, is 
involved in adaptive defensive behaviours. Indeed, rat bedding 
induces robust avoidance of the predator cues in TrpC2+/− but not in 
TrpC2−/− animals (Fig. 4g, h). Moreover, TrpC2−/− animals exhibited 
abnormal ingestive behaviour towards the predator bedding, suggesting that 
VNO inputs also inhibit foraging (Supplementary Fig. 7). 

We report here the identity of 71 (63 single, 8 unresolved) receptors 
activated by heterospecific scents. Consistent with the distinct 
behavioural outputs generated by pheromones and kairomones, we found 
that only 11 receptors were common to both types of cues, whereas 60 
were uniquely activated by heterospecific stimuli, and 17 by mouse cues 
only (Fig. 3d). The detection of kairomones thus emerges as a major 
function of the VNO. The identity of one of the identified receptor 
populations for the detection of predator cues was confirmed 
independently by Egr1 activation in cells expressing YFP under the V1Rh7 
Figure 4 | Receptor responses to heterospecific cues. a, b, Predator cues are 
detected by a specific subset of V1Rs and V2Rs. a, Heat map representing the 
co-localization between Egr1 and representative vomeronasal receptor genes 
(colour coding as Fig. 3a). b, In situ hybridization of Egr1 (green) and 
vomeronasal receptors (red), with arrows marking co-localization of Egr1 and 
receptor signals. Scale bar, 100 μm. b, c, Mammalian predator cues commonly 
activate V2R clade 5 receptors. Owing to high homology among V2R clade 5 
genes, the Vmn2r30, 33, 34, 39 probes detect multiple receptors. d, Fluorescence 
image showing a patched V1rh7-YFP neuron. e, Loose-patch recordings of a 
V1rh7-YFP neuron. The arrow indicates perfusion start. f, Spike raster for three 
different VNO neurons, showing responses of a V1rh7-YFP neuron to ferret, but 
not to rat stimuli, and no response of a V1Rh7-YFP-negative neuron to ferret 
stimuli. The stimulus perfusion started at −30 s and lasted 20 s. g, h, Rat bedding 
(arrow) elicits robust avoidance behaviours in control TrpC2+/− mice, but 
significantly less in TrpC2−/− mice lacking VNO activity. ***P < 0.0001 
(two-tailed Student’s t-test). Error bars, s.e.m. (TrpC2+/−, n = 13; TrpC2−/−, n = 17). 


promoter (Supplementary Fig. 8). Further, loose-patch recording of 
V1Rh7-YFP-expressing neurons demonstrated a significant increase in 
firing rates following exposure to ferret, but not to rat stimuli 
(1.732 ± 0.170 Hz for ferret, 0.420 ± 0.061 Hz for rat, n = 4) 
(Fig. 4d–f, Supplementary Fig. 9). 

Remarkably, some receptors show unique association with distinct 
classes of predators. Vmn2r89 and Vmn2r121 were exclusively 
activated by scents from snakes, V1rc10/11/12 by owls. Also, up to 70% of 
V2R clade 5 neurons were activated by every mammalian predator 
tested, but not by sympatric non-predators (Fig. 4a–c, Supplementary 
Figs 5, 10). Moreover, each predator cue generated a distinct receptor 
signature: for example, rat stimuli activate Vmn2r59, Vmn2r60, 
Vmn2r61, Vmn2r108 and Vmn2r110, all within clade 8, whereas ferret 
cues activate V1rf5 and Vmn2r77/78/79, suggesting that the mouse 
VNO has the sensory machinery to discriminate predator species. 

We then searched for receptors detecting the sympatric species 
Mus spicilegus and Mus musculus, which diverged evolutionarily 
~1.5 million years ago and do not interbreed in the wild. Receptors 
activated by M. spicilegus and M. musculus male cues appear mostly 
distinct, though often closely related (Supplementary Figs 5, 11). For 
Figure 5 | Sulphated steroid detection by V1Rs. a, V1Ref and V1Rjk 
clade-specific probes (red) co-localize with Egr1 (green) after VNO stimulation by a 
mix of steroids containing glucocorticoids such as Q1570 (green), oestrogens 
such as E1050 (red) and androgens such as A7864 (blue). Each of these 
compounds on its own elicits activity in distinct populations of vomeronasal 
neurons (V1re2, V1re6, V1rf3 and V1rj2), also represented in the molecular 
tree of V1R receptors (b). Specific receptors detecting each steroid are indicated 
by dots, using the same colour scheme as a and c. c, The three distinct 
oestradiols (red) activate both V1rf3 and V1rj2 whereas the oestriol (purple) 
only activates V1rf3. d, The sulphate group position in pregnenes 
(corticosteroids in green, pregnenolone in orange) determines the specificity of 
ligand detection by V1re2 and V1re6. Differences in chemical structures among 
tested compounds are highlighted by coloured circles. Arrowheads mark 
co-localization between Egr1 and receptor signals. Scale bar, 100 μm. 
example, Vmn2r8/9 and Vmn2r11, activated by M. spicilegus, 
and Vmn2r15, Vmn2r16 and Vmn2r17, activated by M. musculus, 
belong to clade 6 (Supplementary Fig. 11b). Likewise, Vmn2r69 
(activated by M. spicilegus) and Vmn2r66 (activated by M. musculus) 
belong to clade 3. Thus, through the activation of specialized receptors, 
M. musculus may readily discriminate scents emitted by closely related 
but reproductively incompatible species, a property that could be 
linked to the reproductive isolation of these species. 

V1Rs and V2Rs are associated with segregated neural pathways, 
raising the possibility that fundamental functional differences may 
exist between the two families. Remarkably, our data suggest that 
V1Rs and V2Rs display different receptor properties. Nearly half of 
the V1Rs (27 out of 56) exhibit generalized activation by multiple cues 
(Fig. 3e), including signals with apparently conflicting behavioural 
significance. For example, receptors within the V1Rh, V1Rc and V1Re 
clades were activated by mouse, predator and non-predator cues 
(Supplementary Tables 1 and 2, Supplementary Fig. 12). In contrast, 
most V2Rs (29 out of 32) are activated by cues reflecting a unique 
ethological context such as a male, a female, or a given type of predator 
or non-predator. In addition, hierarchical clustering across all 
identified receptors revealed clear segregation between V1Rs and V2Rs 
(Supplementary Fig. 5). These results suggest that V1R and V2R 
pathways may encode different types of information: individual V2Rs 
appear uniquely poised to encode information about the identity of 
emitters with clear behavioural significance—for example, the sex of a 
conspecific or the nature (predator or competitor) of a heterospecific. 
In contrast, individual V1Rs may encode other forms of biologically 
relevant information. 
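
The specific-versus-promiscuous distinction described above (Fig. 3e) amounts to counting how many cue categories activate each receptor. The following is a minimal sketch of that tally, not the authors' analysis code. The receptor names in `toy` are placeholders and the responses are invented purely for illustration; the function names and the name-prefix rule for assigning a family are assumptions.

```python
from collections import Counter

def receptor_family(name):
    """Infer the receptor family from the gene-name prefix (assumed convention)."""
    if name.startswith("Vmn2r"):
        return "V2R"
    if name.startswith("V1r") or name.startswith("Vmn1r"):
        return "V1R"
    raise ValueError(f"unrecognised receptor name: {name}")

def specificity(categories):
    """Label a receptor by the number of distinct cue categories that activate it."""
    n = len(set(categories))
    if n == 0:
        return "unresponsive"
    return "specific" if n == 1 else "promiscuous"

def tally(responses):
    """Count specific and promiscuous receptors per family, skipping unresponsive ones."""
    counts = Counter()
    for receptor, categories in responses.items():
        label = specificity(categories)
        if label != "unresponsive":
            counts[(receptor_family(receptor), label)] += 1
    return counts

# Toy input: placeholder names and invented responses, NOT data from the study.
toy = {
    "V1r_demo1": {"male", "mammalian predator"},
    "V1r_demo2": {"female"},
    "Vmn2r_demo1": {"mammalian predator"},
    "Vmn2r_demo2": {"reptile"},
    "Vmn2r_demo3": set(),
}
counts = tally(toy)
for key in sorted(counts):
    print(key, counts[key])
```

With the real co-localization matrix as input, the same tally would yield the per-family counts summarized in Fig. 3e.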

To gain further insight into the molecular logic of V1R-mediated 
signalling, we investigated the detection of sulphated steroids, which 
are thought to account for 80% of VNO neuronal activation by female 
urine (probably through V1Rs). Our data show that, when male mice 
were exposed to a mix of synthetic steroid sulphates, receptors from 
V1Ref and V1Rjk clades were specifically activated (Fig. 5a, b). We then 
tested individual compounds to attempt the pairing of specific steroid 
ligands with their cognate receptors. Corticosterone-21 sulphate 
(Q1570), a compound in female urine, strongly activated V1re2 and 
more weakly V1re6 cells (Fig. 5a, b). Both receptors were shown in 
earlier experiments to be specifically activated by female cues (Fig. 3a). 
In addition, we uncovered strong activation of V1rf3 by 17β-oestradiol 
sulphate (E1050) and V1rj2 by both E1050 and 5-androstene-3β,17β-diol 
disulphate (A7864) (Fig. 5a), although these two receptors were not 
activated by female bedding, indicating that these steroids are not 
secreted under normal conditions. 

Thus, our approach efficiently achieved single compound resolution, 
offering the unique opportunity to test the receptor specificity to a 
variety of individual chemicals. We further tested four sulphated 
oestrogen compounds structurally related to E1050, and three additional 
sulphated pregnenes structurally related to Q1570. V1rf3 appeared 
broadly selective to oestradiols, oestriols and related stereoisomers, 
regardless of sulphate positions, but did not respond to androgens or 
glucocorticoids (Fig. 5c). Interestingly, no other V1rf receptor was 
activated by these ligands. In contrast, V1rj2 was activated by androgens and 
oestradiols but not oestriols. Similarly, V1re2 and V1re6 selectively 
detected corticosteroids (Fig. 5d). Therefore, V1R receptors can distin- 
guish distinct structural classes of steroids. Androgens, oestrogens and 
glucocorticoids are ubiquitous though sensitive reporters of the animal 
endocrine state. Our results thus suggest that V1Rs may serve as detec- 
tors of the physiological status of an animal. 

In conclusion, our data have begun to uncover the molecular logic 
by which vomeronasal receptors of different families, clades and recep- 
tor sequences extract biological information and trigger appropriate 
behavioural responses to animals of a given sex, species and physio- 
logical status. The collection of receptors uncovered in this study pro- 
vides a molecular foundation to further dissect the neural circuits 
governing social and sexual communication in rodents. 
METHODS SUMMARY 


Stimulus exposure was conducted by introducing a subject animal (male or female 
CD-1 mice, 8 to 14 weeks old) in a fresh cage containing distinct animal cues for 
30 min (for Fig. 1) or 40 min (for Figs 2-5). The dissected VNOs were embedded in 
OCT (Tissue-Tek) and frozen in dry ice. Cryosections (16 μm) of VNO were 
subjected to RNA in situ hybridization using IEG and VR probes. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Sampling of animal stimuli. Bedding materials used in this study were all freshly 
sampled from cages that house live animals (Harvard University, Harvard 
Museum of Natural History, Harvard Concord Field Station, Tufts University, 
Museum of Science, Boston, and New England Wildlife Center). Soiled bedding 
represents the most complete stimulus source for animals, and is also of ecological 
relevance. Bedding materials typically absorb a wide range of chemical stimuli 
excreted by animals, such as urine, faeces, saliva, fur, and other gland secretions. 
Since different animals are housed in different environments, we flexibly adjusted 
the sampling procedures. For instance, chemosignals emitted by heterospecific 
mammals and birds (male rat, female fox, male ferret, female bobcat, male 
Peromyscus, male M. spicilegus, male and female gerbils, male and female hamsters, 
male and female rabbits, woodchuck, pigeon, red tailed hawk, screech owl, and 
great horned owl) were sampled as soiled bedding (paper, woodchips or corn cob). 
For reptiles, we sampled faeces, urate and other gland secretions absorbed in 
woodchips or paper. These bedding materials were directly used for exposure 
experiments (as described separately below). For aquatic animals such as alligators, 
only faecal pellets were sampled. For insect larvae, live animals were directly used 
for exposure experiments. Some predators such as snakes and predatory birds were 
fed mice as part of their diet, and we took great care to avoid potential odour 
contamination. First, when sampling bedding we avoided areas where mouse 
carcass was present in animal cages. Second, to sample milk snake odour, which we 
used extensively in our study, we changed bedding after feeding to avoid 
potential odour contamination from mice. We also tested materials from multiple 
individuals whenever possible. Judging from the number of Egr1-positive cells, we 
did not find extensive individual variability in these samples. If multiple 
individuals were not available, especially for bobcat, hawk and great horned owl, we 
tested stimulus samples from different collection dates. We stored these bedding 
materials at 4 °C for the short term (one week) and at −20 °C for the long term. 
These materials, even after long-term storage at −20 °C when the amount of 
volatiles was significantly reduced, did not appreciably lose their ability to robustly 
stimulate vomeronasal neurons. 

For conspecific stimuli, to represent a potential diversity of chemical cues 
emitted by different subspecies of mice, we collected bedding samples from 5 
different strains of mice: BALB/c (Jackson Labs), PWD/PhJ (Jackson Labs), 
CAST/EiJ (Jackson Labs), Idaho and Chuuk, and exposed these samples as a 
mixture. It is known that mice secrete different vomeronasal cues reflecting their 
physiological states, for example, different phases of oestrous, prompting us to 
sample materials freshly from cages that house multiple animals over 1 week. 
Thus, conspecific stimuli used in our study probably contain chemosignals 
secreted over different phases of the oestrous cycles. We stored these materials 
at 4 °C for the short term and −20 °C for the long term. 

Stimulus exposure. For most exposure experiments involving bedding stimuli, 
approximately 50 ml (in volume) of bedding containing animal cues were placed 
in a clean cage. We introduced a subject mouse (male or female CD-1, from 8 
weeks to 14 weeks old, Charles River), which voluntarily made extensive direct 
contacts with introduced stimuli in freely behaving conditions. The animals were 
exposed to stimuli for 30 min (for Fig. 1) or 40 min (for Figs 2-5). Subsequently, 
the dissected VNOs were embedded in OCT (Tissue-Tek) and frozen in dry ice. 
VNO cryosections (16 μm) were used for RNA in situ hybridization using IEG and 
vomeronasal receptor probes. Control experiments were conducted using fresh 
bedding in an identical manner. For insect larvae exposure, 3-4 insect larvae were 
directly introduced to the cages. For alligator stimuli, a few faecal pellets were used. 
For heterospecific mix exposure experiments, ~100 ml mixture of the following 
bedding sample was used: Peromyscus (P. maniculatus, P. leucopus, P. polionotus), 
mammalian predators (bobcat, fox, ferret, rat), avian predators (screech owl, great 
horned owl, red tail hawk), reptiles (rat snake, milk snake, rattlesnake, boa, alligator), 
and M. spicilegus. For pure chemicals such as ESP1 and sulphated steroids, ~5 μl of 
Ringer’s (in mM: 115 NaCl, 5 KCl, 2 CaCl2, 2 MgCl2, 25 NaHCO3 and 5 HEPES) 
containing the stimuli was directly spotted on each nostril. Recombinant ESP1 was 
purified as a GST fusion protein overexpressed in Escherichia coli using pET41 
vector (Novagen), followed by thrombin cleavage to release the ESP1 peptide. 
Each animal was exposed to 2 μg of the peptide. 
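
For readers preparing the Ringer’s solution, the millimolar recipe above converts to grams per litre as concentration times molar mass. This is a minimal sketch under the assumption that anhydrous salts are weighed; the source does not state which hydrate forms were used, and hydrated salts (for example CaCl2 dihydrate or MgCl2 hexahydrate) would need their own molar masses. The dictionary and function names are illustrative.

```python
# Ringer's composition in mM, as listed in the text above.
RINGER_MM = {"NaCl": 115, "KCl": 5, "CaCl2": 2, "MgCl2": 2, "NaHCO3": 25, "HEPES": 5}

# Molar masses in g/mol for the ANHYDROUS forms (an assumption, see lead-in).
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "CaCl2": 110.98,
    "MgCl2": 95.21,
    "NaHCO3": 84.01,
    "HEPES": 238.30,
}

def grams_per_litre(recipe_mm, molar_mass):
    """Convert mM concentrations to g/l: (mmol/l) * (g/mol) / 1000."""
    return {name: mm * molar_mass[name] / 1000.0 for name, mm in recipe_mm.items()}

recipe = grams_per_litre(RINGER_MM, MOLAR_MASS)
for name, grams in recipe.items():
    print(f"{name}: {grams:.3f} g per litre")
```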

Sulphated steroid exposure. Steroids were purchased from Steraloids. A mix of 
steroids (A6940, epitestosterone sodium sulphate; A7864, 5-androsten-3β,17β-diol 
disulphate; E1050, 17β-oestradiol sulphate; E0893, 17α-oestradiol sulphate; 
P3817, allopregnanolone sulphate; P8200, epipregnanolone sulphate; Q1570, 
corticosterone 21-sulphate; Q3470, deoxycorticosterone 21-glucoside; each at 
250 μM in Ringer’s) was used for initial screens. Subsequently, individual steroids 
(Q1570; E1050; A7864; E0893; E0588, 17β-dihydroequilin 3-sodium sulphate; 
E1100, 17-oestradiol 3-sulphate; E2734, oestriol 17-sulphate; Q3910, 
hydrocortisone 21-sodium sulphate; Q2525, cortisone 21-sulphate; Q5545, 
3β-hydroxy-5-pregnen-20-one 3-sulphate) were used at 500 μM in Ringer’s. 5 μl of steroid 


solution were spotted on each nostril of male CD-1 animals (8-14 weeks), and 
the animals were exposed to steroids for 40 min. Experiments were conducted for 
at least three animals. 

Preparation of RNA probes. For immediate early gene probes, we have cloned 
complementary DNA of Arc, c-Fos, c-Jun, Egr1, FosB and Nr4a1 in approximately 
900-base-pair (bp) segments to pCRII-TOPO or pCR4-TOPO vector (Invitrogen). 
Antisense cRNA probes were synthesized using T3, T7 or Sp6 polymerases 
(Promega) and digoxigenin (DIG) or fluorescein (FITC) labelling mix (Roche) 
from PCR templates. All IEG probes consisted of a cocktail of 2-3 probes to cover 
nearly the full length of these messenger RNAs. 

For V1R clade-specific probes, we cloned full length coding sequence of V1R 
receptors (V1rab: V1ra1, a2, a3, a4, a5, a6, a7, a8, b1, b2, b7, b8, b9; V1rc: V1rc3, c8, 
c10, c16, c28; V1rd: V1rd6, d9, d11, d12, d14, d22, Vmn1r167; V1ref: V1re1, e2, e3, 
e4, e6, e7, e8, e9, e10, e11, e12, e13, Vmn1r224, f1, f2, f3, f4, f5; V1rh: V1rh1, h20; 
V1ri: V1ri1, i3, i4, i5, i6, i8; V1rjk: V1rj2, j3, k1) and combined these probes to 
generate a clade-specific probe set. For V1rg receptors, ~1 kilobase (kb) 5′-UTR/ 
intron sequences of the following genes were used: V1rg1, g2, g3, g4, g5, g6, g7, g8, 
g9, g10, g11, g12, Vmn1r77, which were combined with V1rl cDNA probe to 
generate the V1Rgl clade probe set. 

To generate clade-specific V2R probes, we cloned the first ~900 bp of annotated 
V2R receptor coding sequence (V2R clade 1: Vmn2r55; V2R clade 2: Vmn2r19, 
Vmn2r20, Vmn2r24; V2R clade 3: Vmn2r65, Vmn2r69, Vmn2r76, Vmn2r77; V2R 
clade 4: Vmn2r115; V2R clade 5: Vmn2r28, Vmn2r48; V2R clade 6: Vmn2r8, 
Vmn2r15, Vmn2r17, Vmn2r84, Vmn2r89, Vmn2r118; V2R clade 7: Vmn2r18, 
Vmn2r81, Vmn2r83, Vmn2r120; V2R clade 8: Vmn2r57 3′-UTR probe, 
Vmn2r58, Vmn2r63, Vmn2r58, Vmn2r90, Vmn2r93, Vmn2r96, Vmn2r97, 
Vmn2r99, Vmn2r102, Vmn2r104, Vmn2r105, Vmn2r106, Vmn2r108, 
Vmn2r110, and Vmn2r64 3′-UTR probe) and combined these probes to generate 
clade-specific probe sets. To generate cRNA probes specific to individual V1R 
genes, we cloned ~1 kb 5′-UTR intron sequence of V1R genes to pCRII vector 
(Invitrogen). To produce cRNA probes specific to individual V2R receptors, we 
cloned ~600 bp of V2R 3′-UTR segments. These RNA probes were first used to 
test mRNA expression. We found that some annotated vomeronasal receptor 
genes did not appear to be expressed, since these RNA probes gave no discernible 
signals. For all vomeronasal receptor genes for which we could confirm 
expression, we tested the specificity of these probes by dual colour in situ 
hybridization using DIG and FITC probes and used them for receptor mapping experiments. 
Probes generated in our study to detect specific receptors are listed in 
Supplementary Table 1. The VR nomenclature was based on that of GenBank 
and Mouse Genome Informatics. 
RNA in situ hybridization. Single colour RNA in situ hybridization was 
conducted essentially as described. We used DIG labelled cRNA probes at 2 ng μl−1 
and used a hybridization temperature of 65 °C for experiments shown in Fig. 1. For 
Egr1 in situ hybridization experiments shown in Fig. 2, we used 68 °C as the 
hybridization temperature. Dual colour fluorescence in situ hybridization was 
conducted in the following steps. First, the tissue was fixed in 4% formaldehyde/ 
1× PBS for 10 min, and washed 3 times with 1× PBS for 3 min each. The tissues 
were treated with acetylation solution (0.1 M triethanolamine with 2.5 μl ml−1 
acetic anhydride) for 10 min. After 3 washes with 1× PBS, each for 5 min, the 
slide was incubated with the pre-hybridization solution (50% formamide, 5× SSC, 
5× Denhardt’s, 2.5 mg ml−1 yeast RNA, 0.5 mg ml−1 herring sperm DNA) 
for 2 h. The hybridization buffer (4% dextran sulphate, Millipore, added to 
pre-hybridization buffer) containing FITC labelled Egr1 probes (a cocktail of three 
probes, each at 50 pg μl−1) and DIG labelled VR probes (at 0.5 ng μl−1 for cDNA 
probes, and 1 ng μl−1 for 5′-UTR-intron and 3′-UTR probes) was heated at 95 °C 
for 3 min and immediately chilled on ice for 5 min. The hybridization solution was 
applied to the slides, which were covered with parafilm and incubated in a sealed 
chamber for 16 h at 68 °C. Following hybridization, the slides were washed with 
5× SSC once for 5 min, and with 0.2× SSC three times, each for 20 min at 68 °C. 
Slides were washed at room temperature with 0.2× SSC for 5 min and 
subsequently with TNT buffer (100 mM Tris, pH 7.5, 150 mM NaCl, 0.05% Tween 
20) for 5 min. 

After the post-hybridization washes, 200 μl of anti-FITC-POD (Roche, at 1/250 
dilution in TNB blocking buffer, Perkin Elmer) was applied and incubated for 3 h 
at room temperature. Slides were washed with TNT buffer for a total of 1 h, with 
buffer exchanges every 10 min. The signal was developed using the TSA biotin plus 
kit (Perkin Elmer), as per manufacturer’s protocol. The slides were washed with 
TNT buffer 3 times, each for 5 min, and subsequently treated with 3% H2O2/ 
1× PBS to quench residual peroxidase activity. Slides were washed again 3 times with 
1× PBS and TNT, each for 5 min. DIG antibody solution (anti-DIG-POD, Roche, 
at 1/500 dilution, and Streptavidin-Alexa488, Invitrogen, at 1/250 dilution in TNB 
buffer) was applied to the slides and incubated overnight at 4 °C. After washing 
slides with TNT (6 times, 10 min each), the signal was developed using the TSA 
Cy3 plus kit (Perkin Elmer) as per manufacturer’s protocol. Slides were washed 
with TNT (3 times, 5 min each and once for 1 h), and tissues were mounted with 
Vectashield (Vector labs) containing 8 μg ml−1 DAPI. All the microscopy images 
were acquired using LSM510 or AxioImager Z2 (Zeiss). 

Analysis of in situ hybridization images. For single colour in situ hybridization 
images, quantitation was conducted using a minimum of 10 VNO sections per 
animal and 3 animals (data in Fig. 1) or 3-4 animals (data in Fig. 2). Since we found 
that 0.2 mm² represents the area occupied by medial cryostat sections of the VNO 
and contains approximately 1,000 VNO cells, we used the average number of 
Egr1-positive cells per 0.2 mm² in Fig. 1, and we converted these numbers to percentage 
of activated neurons among total VNO neurons in Fig. 2. For dual colour in situ 
hybridization, we quantitated the co-localization of Egr1 and receptor signals over 
four sections per VNO, for a minimum of three animals. We then calculated the 
percentage of activated neurons in specific receptor neurons, for each odour class, 
and generated a co-localization matrix. In many cases, we found that individual 
receptor mapping is unnecessary when the hierarchical screen can unequivocally 
demonstrate that there are no activated neurons in specific receptor clades. In 
these cases, we input zero values to the co-localization matrix. For hierarchical 
clustering of the co-localization matrix, we used the Cluster program 
(http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm), with average linkage 
in Euclidean distance. To generate the clustering diagram in Supplementary Fig. 4, 
we calculated the average number of receptor neurons per receptor in 12 sections 
and used this as a weight. The heat map and clustering dendrogram were generated 
using the Java Treeview program (http://jtreeview.sourceforge.net/). 
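
The quantitation steps above can be sketched in a few lines. The sketch below is illustrative only: it is not the Cluster program, it omits the per-receptor weighting described above, and every name and number in it is a made-up placeholder. It shows (1) the conversion from Egr1-positive cells per 0.2 mm² section to a percentage, using the stated ~1,000 cells per section, (2) a co-localization percentage with zero-filling when no receptor cells are counted, and (3) a plain average-linkage agglomeration on Euclidean distance.

```python
import math
from itertools import combinations

def percent_activated(egr1_cells_per_section, cells_per_section=1000):
    """Convert Egr1-positive cells per 0.2 mm^2 section to % of VNO neurons."""
    return 100.0 * egr1_cells_per_section / cells_per_section

def colocalization_percent(double_positive, receptor_positive):
    """% of receptor-expressing cells that are also Egr1-positive (0 if none counted)."""
    if receptor_positive == 0:
        return 0.0
    return 100.0 * double_positive / receptor_positive

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(rows):
    """Agglomerate rows (name -> vector); return merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in sorted(rows)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average linkage: mean of all pairwise distances between members.
            d = sum(euclidean(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

# Toy co-localization matrix: rows are placeholder receptors, columns are three
# hypothetical odour classes. The values are invented for illustration.
toy_matrix = {
    "R1": [90.0, 0.0, 0.0],
    "R2": [80.0, 0.0, 0.0],
    "R3": [0.0, 70.0, 60.0],
}
merges = average_linkage(toy_matrix)
for a, b, d in merges:
    print(a, b, round(d, 2))
```

In this toy case the two receptors with similar response profiles merge first, and the receptor with a different profile joins last, which is the behaviour a dendrogram of the real matrix would display at larger scale.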

Behavioural assay. Male TrpC2 mice (+/− or −/−, 8-14 weeks old, ref. 12) were 
single-housed three days before the experiment in a manner blind to the experi- 
menter. The behaviour experiment was conducted by introducing 50 ml volume of 
fresh or rat bedding to one side of the cage, away from the nest area. The beha- 
viours of the subject mice were video recorded and total contact time as well as 
ingestive behaviour were scored by an individual blind to the genotype. We 
defined ingestive behaviour as animals engaged in ingestion while holding a food 
pellet with two forepaws. 

Generation of OMP-GCaMP3 transgenic line. pJOMP plasmid containing the rat 
olfactory marker protein (OMP) genomic sequence was modified so that the 
G-CaMP3 ORF sequence completely replaces the OMP ORF. Linearized vector 
was used for pronuclear injection (performed by Harvard Genome Modification 
Facility), and transgenic founders were further backcrossed to C57BL/6 mice to 
establish an OMP-GCaMP3 line. This line expresses the transgene uniformly throughout 
the vomeronasal epithelium and showed no sign of reported cell toxicity. 
Calcium imaging on VNO slices. Calcium imaging was carried out essentially as 
described, using 5-8-week-old female OMP-GCaMP3 mice. The VNOs were 
acutely dissected, separated from bones, and embedded in 4% low melting point 
agar in mACSF (in mM: 130 NaCl, 5 KCl, 1 MgCl2, 2.5 CaCl2, 1.25 NaH2PO4, 25 
NaHCO3, 10 glucose). The coronal vibratome sections (200 μm) were cut, and 
slices were kept in continuously oxygenated mACSF for up to 8 h at 25 °C. The 
flow rate of the stimulus was approximately 0.3 ml min−1, and we delivered 
stimulus for 40 s. All imaging was conducted at 25 °C. The fluorescence changes due to 
calcium transients were monitored using a LSM710 microscope with a GaAsP 
detector (Zeiss). We used a 1:100 dilution of freshly sampled rat urine from 
2-6-month-old CD male rats (Charles River) in mACSF. For snake stimuli, shredded 
snake bedding (that is, paper) was extracted with mACSF, filtered and used for 
stimulation. To quantify the number of activated cells, we first generated ΔF 
images by subtracting an average of 20 s frames corresponding to initial resting 
period from the raw images. We then created an average ΔF image consisting of 
10 s frames corresponding to the maximum fluorescence interval (shown in 
Fig. 2c). This operation significantly reduced the fluorescence signals from 
spontaneous activity, which is typically short (lasting 1-2 s) and consists of small bursts, 
and enriched evoked activity, which is a more sustained (more than 10 s), larger 
rise in fluorescent intensity. The fluorescence traces of individual positive cells 
were further examined to confirm the sustained nature of the response. The 
number of activated cells was quantified using ImageJ. To quantify the number 
of viable cells during the imaging experiments, we counted the number of 
G-CaMP3-positive cells responsive to 40 mM KCl in mACSF. 
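
The ΔF averaging described above can be illustrated on a single fluorescence trace. This is a simplified sketch, not the authors' pipeline: the study processed whole images and counted cells in ImageJ, whereas the code below treats one cell (or pixel) at a time. The 1 Hz frame rate and the toy traces are assumptions made for illustration. Baseline is the mean of the first 20 s, and the response is the highest 10 s windowed mean of the baseline-subtracted trace.

```python
def windowed_delta_f(trace, frame_rate_hz=1.0, baseline_s=20.0, window_s=10.0):
    """Return the largest window-averaged ΔF of a fluorescence trace.

    trace: fluorescence values, one per frame, starting with the resting period.
    frame_rate_hz: acquisition rate (assumed; not given in the text).
    """
    n_base = int(round(baseline_s * frame_rate_hz))
    n_win = int(round(window_s * frame_rate_hz))
    if len(trace) < max(n_base, n_win):
        raise ValueError("trace is shorter than the baseline or averaging window")
    baseline = sum(trace[:n_base]) / n_base
    delta = [f - baseline for f in trace]
    # Slide the averaging window over the trace and keep the maximum mean.
    return max(
        sum(delta[i:i + n_win]) / n_win
        for i in range(len(delta) - n_win + 1)
    )

# Toy traces at an assumed 1 frame per second (values are invented).
sustained = [100.0] * 20 + [150.0] * 15 + [100.0] * 10   # evoked-like, 15 s rise
transient = [100.0] * 20 + [150.0] * 2 + [100.0] * 23    # spontaneous-like, 2 s burst

print(windowed_delta_f(sustained))
print(windowed_delta_f(transient))
```

Both toy traces reach the same peak, but the 10 s average retains the sustained response in full while the 2 s burst is diluted fivefold, which is the enrichment of evoked over spontaneous activity that the paragraph describes.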
Electrophysiology. Loose patch recordings were performed at room temperature 
with a Multiclamp 700B (Axon Instruments). Data were recorded at 10 kHz, low 
pass filtered at 2 kHz and digitized with a Digidata 1440A digitizer (Axon 
Instruments). Borosilicate glass (Sutter Instrument Co., o.d. 1.5 mm, i.d. 
0.86 mm) patch pipettes (3-8 MΩ) were pulled on a Flaming/Brown micropipette 
puller (Sutter Instrument Co.). The same mACSF was used as the pipette 
solution. Data were acquired with pClamp and analysed in Matlab. Pneumatic 
electronic valves (Clippard Instruments) were used to control the flow of stimuli. 
Electronic valves were controlled via digital output from the Digidata 1440A 
digitizer. The valves were opened for 20 s in every stimulated trial. For rat stimulus, 
we used 1:200 dilution of rat urine (male CD rats, Charles River, 2-6 months old) 
in mACSF. For ferret stimuli, ~50 ml volume of ferret bedding containing urine, 
faeces, fur and gland excretions was extracted with 50 ml of mACSF overnight at 
4°C, then filtered and used for experiments. 
A natural polymorphism alters odour and DEET 
sensitivity in an insect odorant receptor 


Maurizio Pellegrino, Nicole Steinbach, Marcus C. Stensmyr, Bill S. Hansson & Leslie B. Vosshall 


Blood-feeding insects such as mosquitoes are efficient vectors of 
human infectious diseases because they are strongly attracted by body 
heat, carbon dioxide and odours produced by their vertebrate hosts. 
Insect repellents containing DEET (N,N-diethyl-meta-toluamide) 
are highly effective, but the mechanism by which this chemical wards 
off biting insects remains controversial despite decades of 
investigation. DEET seems to act both at close range as a contact 
chemorepellent, by affecting insect gustatory receptors, and at long range, 
by affecting the olfactory system. Two opposing mechanisms for 
the observed behavioural effects of DEET in the gas phase have been 
proposed: that DEET interferes with the olfactory system to block 
host odour recognition and that DEET actively repels insects by 
activating olfactory neurons that elicit avoidance behaviour. Here 
we show that DEET functions as a modulator of the odour-gated ion 
channel formed by the insect odorant receptor complex. The 
functional insect odorant receptor complex consists of a common 
co-receptor, ORCO (ref. 15) (formerly called OR83B; ref. 16), and 
one or more variable odorant receptor subunits that confer odour 
selectivity. DEET acts on this complex to potentiate or inhibit 
odour-evoked activity or to inhibit odour-evoked suppression of 
spontaneous activity. This modulation depends on the specific odor- 
ant receptor and the concentration and identity of the odour ligand. 
We identify a single amino-acid polymorphism in the second trans- 
membrane domain of receptor OR59B in a Drosophila melanogaster 
strain from Brazil that renders OR59B insensitive to inhibition by 
the odour ligand and modulation by DEET. Our data indicate that 
natural variation can modify the sensitivity of an odour-specific 
insect odorant receptor to odour ligands and DEET. Furthermore, 
they support the hypothesis that DEET acts as a molecular ‘confusant’ 
that scrambles the insect odour code, and provide a compelling 
explanation for the broad-spectrum efficacy of DEET against mul- 
tiple insect species. 

Previous work has shown that the odour of Drosophila food potently 
attracts adult D. melanogaster vinegar flies and that DEET blocks this 
attraction. The behavioural effects of DEET require an intact olfactory 
system and the olfactory co-receptor ORCO. These results implicated 
the olfactory system in the observed behavioural effects but failed both 
to distinguish between the two competing models of action for DEET 
and to determine whether DEET acts on the odour-specific odorant 
receptors, ORCO or both. We carried out electrophysiological record- 
ings of Drosophila olfactory sensory neurons (OSNs) to test these com- 
peting possibilities. 

In response to the suggestion that DEET and odours may interact in 
the vapour phase, we first quantified the respective amounts of 
vapour-phase 1-octen-3-ol emitted from the stimulus pipette in the 
presence and absence of DEET, using solid-phase microextraction 
(SPME) followed by gas chromatography mass spectroscopy analysis 
(GC-MS). The SPME measurements coupled to GC-MS (Fig. 1a) 
showed that the addition of a second filter paper containing pure 


DEET in the stimulus pipette had no significant effect on the release 
of 1-octen-3-ol (10° dilution). Thus, we can rule out any fixative role 
of DEET under the conditions used here. 

We next performed extracellular recordings to measure the effect of 
DEET on responses elicited by odours in Drosophila OSNs housed 
within the ab2 (Fig. 1a and Supplementary Fig. 1) or ab3 (Supplementary
Fig. 2) olfactory hairs, or sensilla, on the fly antenna. Each of these
sensilla houses two OSNs expressing different odorant receptors with
unique odour response profiles. We measured the activity of these
OSNs simultaneously and compared their responses to odour with and 
without co-presentation of DEET (Fig. 1b, c). 

The effect of DEET on four OSNs stimulated with ten structurally 
diverse odours was complex and dependent on odorant receptor, 
odour and concentration. In some OSNs, DEET suppressed odour- 
mediated inhibition (Fig. 1d, f and Supplementary Fig. 1a), in others it 
decreased odour-induced activation (Fig. 1e, Supplementary Fig. 1b, d,
e and Supplementary Fig. 2a-g) and in others it had no effect (Fig. 1g
and Supplementary Figs 1c and 2h-j). Moreover, the effects of DEET 
were strongly concentration dependent, such that high odour concen- 
trations often overcame the effects of DEET (Fig. 1 and Supplementary 
Figs 1 and 2). DEET presented alone, without odour stimuli, elicited no 
response above that evoked by solvent in ab2A and ab3A neurons, 
slightly activated ab2B neurons and slightly inhibited ab3B neurons; 
but responses were considerably smaller than those elicited by cognate 
odour ligands (Supplementary Fig. 3). Therefore, DEET alone has a 
negligible effect on olfactory responses in ab2 and ab3 neurons. 

Notably, 1-octen-3-ol presented in a dilution of 10⁻² had opposite
effects on the two neurons housed in ab2 sensilla, inhibiting the ab2A 
neuron expressing OR59B-ORCO (Fig. 1d) and activating the ab2B 
neuron expressing OR85A-ORCO (Fig. 1e). Co-application of DEET
inverted OSN responses to odour, leading to activation of the ab2A 
neuron (Fig. 1d) and suppressing the odour-induced activation of the 
ab2B neuron (Fig. 1e). Similar opposite effects of DEET were observed
when the ab2 sensillum was stimulated with a different odour, 1-octanol 
(Supplementary Fig. 1a, b). 

Taken together, our results support the hypothesis that DEET acts 
as a molecular confusant, scrambling the Drosophila odour code by 
direct modulation of odorant receptor activity dependent on the type 
of odour and its concentration (Fig. 1h). Recent work examining the 
effect of DEET on mosquito odorant receptors in heterologous cells 
supports this hypothesis.

Because the effects of DEET varied with the specific OSN and odour 
tested, it seems unlikely that DEET acts directly and solely on the 
conserved co-receptor ORCO, which is co-expressed in all the OSNs 
examined here. To determine whether DEET acts on the odour- 
specific odorant receptor subunit, we focused on the pharmacology of 
the OR59B-ORCO complex in ab2A OSNs. 1-octen-3-ol inhibits basal 
activity of OR59B-ORCO at low concentrations but acts as an agonist 
at high concentrations (Fig. 1d). DEET interfered with inhibition of
OR59B-ORCO by 1-octen-3-ol, 1-octanol and linalool, but had no
effect on odour-dependent activation by methyl acetate and
2,3-butanedione (Fig. 1g and Supplementary Fig. 1c). Notably, DEET
had no effect on the OR59B-ORCO activation seen at higher
concentrations of 1-octen-3-ol. This selective effect on inhibition might be
explained by the presence on the OR59B receptor of distinct
1-octen-3-ol-interaction sites: a high-affinity site that inhibits the odorant
receptor complex and is modulated by DEET, and a low-affinity
DEET-independent site that activates the odorant receptor complex.

Figure 1 | DEET scrambles the Drosophila odour code. a, SPME and GC-MS
quantitation of 10⁻² 1-octen-3-ol emitted from the stimulus pipette in the
absence (cyan bar) or presence (blue bar) of pure DEET. Data represent peak
area (NS, not significant; t-test; mean ± s.e.m., n = 5). b-c, Representative
traces of single-sensillum recordings from OR59B-ORCO in the ab2A OSN
(red spikes) and OR85A-ORCO in the ab2B OSN (black spikes), stimulated by
10⁻² 1-octen-3-ol with (b) and without (c) DEET, were recorded
simultaneously and subsequently separated using spike-sorting algorithms.
Bars represent 1-s odour stimulus. The delayed onset of odour response is a
function of the odour delivery system. d-g, Dose-response curves of
OR59B-ORCO in ab2A (d, f, g) and OR85A-ORCO in ab2B (e), stimulated with
increasing concentrations of 1-octen-3-ol (d, e), linalool (f) and methyl acetate
(g) in the absence (light colour) or presence (dark colour) of DEET. Bar plots
next to the dose-response curves represent responses to the solvent paraffin oil
(PO) in the absence (grey bar) or presence (black bar) of DEET (**P < 0.01,
***P < 0.001; F-test with Bonferroni correction; mean ± s.e.m., n = 8-22). Δ,
relative response (Methods); v/v, volume concentration. h, Summary of effects
of DEET on the Drosophila ab2 and ab3 odour code derived from
dose-response curves in d-g and Supplementary Figs 1 and 2. The significance of the
change in response due to co-application of odorant and DEET was assessed
using an F-test. NA, not applicable.


To investigate the mechanistic basis of OR59B modulation by 
DEET, we turned to analysis of this receptor in D. melanogaster strains 
collected around the world. Polymorphisms in natural populations 
have been previously connected to different sensitivity to odours in
humans, and oxygen and carbon dioxide sensing in the nematode
Caenorhabditis elegans. We reasoned that naturally occurring
polymorphisms in insect odorant receptors might modify odorant
receptor/odorant interaction sites and affect their sensitivity to DEET. To
search for putative polymorphisms that affect DEET responses, we
assessed responses of ab2A neurons to 1-octen-3-ol in 10⁻² dilution
in the absence or presence of DEET in 18 wild-type D. melanogaster
strains from locations around the world, and compared these
responses with those obtained in the w1118 laboratory control strain
(Fig. 2a, b and Supplementary Fig. 4a). In each strain, ab2 sensilla were
identified by the characteristic size and location of the sensilla and
responses of the ab2A cell to its cognate ligand, methyl acetate (data
not shown). In 17 of the 18 strains, DEET increased responses of ab2A
neurons to 10⁻² 1-octen-3-ol (Fig. 2b). However, ab2A neurons in the
Brazilian strain Boa Esperança were not inhibited by 1-octen-3-ol at
any concentration tested and were therefore insensitive to modulation
by DEET (Figs 2c and 3a, b and Supplementary Fig. 4b). In addition to
the loss of inhibition by 1-octen-3-ol, the ab2A cell in the Brazilian
strain showed robust activation by 1-octanol and ethyl hexanoate,
odours that normally inhibit the ab2A cell in wild-type strains.
Inhibition by linalool was equivalent in wild-type and Boa Esperança
strains (Fig. 3e). Excitatory responses to methyl acetate, ethyl acetate
and 2,3-butanedione, both in the absence and presence of DEET, did
not differ when compared with the corresponding w1118 neuron
(Fig. 3c, d and Supplementary Fig. 5; data not shown). In control
experiments, we confirmed that the odour response profiles of ab2A
and ab2B OSNs in the Brazilian strain are otherwise similar to that of
our w1118 control strain (Fig. 3f and Supplementary Fig. 5).

We proposed that a genetic polymorphism in Or59b in the Boa
Esperança strain may account for the changed responses to odour
and DEET. We therefore sequenced and compared the coding region
of Or59b in the 19 strains with the published Or59b sequence (NCBI
reference sequence, NP_523822.1), and found seven missense
polymorphisms and 36 silent polymorphisms among all strains
(Supplementary Table 1 and Supplementary Fig. 6). The protein sequence
of OR59B in Boa Esperança is referred to as OR59B^BE and varies from
the NCBI reference at four amino-acid residues (Val41Phe, Val91Ala,
Tyr376Ser and Val388Ala). Among these, two are unique to this strain:
Val41Phe, located in the amino terminus near transmembrane
domain 1 (TM1), and Val91Ala, located within TM2 (Fig. 4a, b and
Supplementary Fig. 6). On the basis of our within-strain sampling, we
detected only one protein variant per strain except for the w1118 control
strain, for which we identified two sequences: one identical to the 
published OR59B sequence (OR59B^{NCBI REF}), and one containing
two missense changes (Fig. 4a and Supplementary Table 1).
We analysed electrophysiological recordings obtained
from the w1118 control strain for each odour tested and found no
evidence that the responses sort into two phenotypically separable
clusters. Therefore, we assume that the two w1118 OR59B
haplotypes are functionally equivalent, at least for
the odours tested in this study. The coding sequences of Orco in the
w1118 and Boa Esperança strains did not differ from the NCBI
reference (data not shown), which suggests that the protein sequence
variations in the odour-specific subunit OR59B, rather than the co- 
receptor ORCO, eliminate inactivation by low concentrations of 
1-octen-3-ol and thereby render the odorant receptor complex 
insensitive to modulation by DEET. 

To test the functional consequences of the four OR59B missense
changes in the Boa Esperança strain, we generated transgenic flies
carrying receptor variants each containing one of the four changes
(Val41Phe, Val91Ala, Tyr376Ser or Val388Ala), a combination of
the two unique to Boa Esperança (Val41Phe and Val91Ala) or those
shared with other strains (Tyr376Ser and Val388Ala), based on the
OR59B^{NCBI REF} backbone. OR59B variants were selectively expressed
in the Drosophila Δhalo ‘empty neuron’ system, in which the
endogenous odour-specific odorant receptors in ab3A OSNs were replaced
with our OR59B mutants (Fig. 4c and Supplementary Fig. 7). As
expected, 10⁻² 1-octen-3-ol caused inhibition of ab3A neurons
expressing OR59B^{NCBI REF} and activation of ab3A neurons
expressing OR59B^BE (Fig. 4c). Whereas OR59B^T376S, OR59B^V388A and
OR59B^{T376S V388A} showed normal inhibition to this odour, any variant
of OR59B containing the Val91Ala change showed a loss of odour
inhibition by 1-octen-3-ol and insensitivity to DEET (Fig. 4c). This
demonstrates that the Val91Ala change is sufficient to phenocopy the
electrophysiological properties of the endogenous Boa Esperança
OR59B (Fig. 4c). It has previously been shown that responses of
OR59B expressed in the empty neuron faithfully recapitulate receptor
function measured in the endogenous ab2A neuron. We therefore
assume that a strain carrying only the OR59B^V91A polymorphism
would have the same phenotype as Boa Esperança.

Figure 2 | OR59B-ORCO sensitivity to DEET varies across wild-type D.
melanogaster strains. a, Schematic of the screening protocol: 10⁻² 1-octen-3-ol
was delivered in the absence and presence of DEET. b-c, Bar plots of
odour-evoked responses of the w1118 strain (b) and 18 wild-type strains (c) to 10⁻²
1-octen-3-ol in the absence (light blue) or presence (dark blue) of DEET (t-test
with Bonferroni correction; mean ± s.e.m., n = 10-17).

Figure 3 | OR59B-ORCO neurons in the Boa Esperança strain are
insensitive to modulation by DEET. a-d, Dose-response curves of the
OR59B-ORCO ab2A OSN in wild-type w1118 (solid line) and Boa Esperança
(dashed line) strains stimulated with increasing concentrations of 1-octen-3-ol
(a, b) or methyl acetate (c, d), with (b, d) or without (a, c) DEET (F-test with
Bonferroni correction; mean ± s.e.m., n = 5-14). The dose-response curve of
w1118 to 1-octen-3-ol in a and b is reproduced from Fig. 1d for comparison. Bar
plots next to the dose-response curves represent responses to the solvent
paraffin oil in the absence (grey bar) or presence (black bar) of DEET (F-test
with Bonferroni correction; mean ± s.e.m., n = 5-11). e, f, Bar plots comparing
responses of OR59B-ORCO in ab2A (e) and OR85A-ORCO in ab2B (f) in
w1118 (solid bar) and Boa Esperança (dashed bar) strains to 10⁻² 1-octen-3-ol,
10 1 1-octanol, 107+ ethyl hexanoate and 10 ' linalool (t-test with Bonferroni
correction; mean ± s.e.m., n = 9-11).

DEET shows behavioural efficacy in insects as diverse as
Drosophila and mosquitoes. We have shown that a single,
naturally occurring polymorphism in an odour-specific odorant 
receptor can modify receptor interactions with an inhibitory odour 
and render the receptor insensitive to modulation by DEET. These 
results provide compelling evidence that DEET interacts directly with 
an odour-specific odorant receptor. Consistent with this, recent work 
showed that an odour-specific OR subunit is required for the
behavioural effects of DEET on mosquito larvae. Our data imply a
complexity in ligand-binding interactions within a single insect odorant
receptor complex that bears further investigation. The Val91Ala
polymorphism is located in the second predicted transmembrane domain
but little is known about which domains of this novel class of
odour-gated ion channels contribute to ligand binding or ion channel
function. A recent study implicated the third predicted transmembrane
domain of an insect odorant receptor in ligand interactions, and
additional structure-function work of this nature will ultimately reveal
how these membrane proteins interact with odorants and modulators
including DEET. Although Val and Ala are both amino acids with
small aliphatic side chains, Val-to-Ala substitutions have been shown
to affect other cation channels. It therefore is plausible that this
change would affect the function of the odour-gated ion channel subunit
encoded by OR59B. We speculate that the Val91Ala polymorphism
inactivates a high-affinity binding site for 1-octen-3-ol that locks the
receptor into a closed configuration at low odour concentration. A
separate, low-affinity binding site on the receptor would lead to
activation. In this model, DEET would selectively interfere
with the high-affinity binding site. Future investigation of the
structure-function relation of this receptor is needed to test these ideas. Genetic
insensitivity to DEET has previously been shown to exist in both
Drosophila flies and Aedes aegypti mosquitoes but the genes
responsible remain unknown. It will be interesting to investigate whether
accumulated odorant receptor polymorphisms contribute to these
phenotypes.

Figure 4 | A single natural polymorphism in OR59B confers insensitivity to
DEET. a, Haplotype network of OR59B protein variants. Each circle represents
a unique OR59B protein variant, its size proportional to the number of strains
containing each variant. Connecting lines show the amino-acid substitutions
that separate each variant. The bold circle represents the OR59B^{NCBI REF} variant
with NCBI accession code NP_523822.1. The Boa Esperança strain is shown
in red. b, Snake plot of OR59B showing the location of missense
polymorphisms. Changes that differentiate Boa Esperança from the NCBI
reference are shown in red. c, Bar plots show the responses of Or59b variants
ectopically expressed in ab3A neurons lacking endogenous OR22A and OR22B
to 10⁻² 1-octen-3-ol in the absence (light blue) or presence (dark blue) of
DEET. The locations of variant amino acids in OR59B are depicted in the
cartoon snake plot on top of each set of bar graphs (t-test with Bonferroni
correction; mean ± s.e.m., n = 7-11).

It has recently been proposed that DEET directly activates
behavioural repulsion through the activation of odorant receptors that
mediate avoidance behaviours. The insect odorant receptor repertoire is
highly diverse with very low protein similarity across insect species.
Furthermore, different species respond very selectively to host odour
cues that meet disparate ecological needs. It seems unlikely that a
single molecule like DEET would activate a different yet similarly potent
repulsive behaviour in all insects tested. Instead, our data support the
hypothesis that DEET is a broad-selectivity insect odorant receptor
modulator that alters the fine-tuning of the insect olfactory system.
DEET-mediated scrambling of the odour code would interfere with
behavioural responses as diverse as mosquitoes orienting to host odours
produced by humans and the attraction of Drosophila to yeast on
rotting fruit.


METHODS SUMMARY 

Fly strains and molecular biology. D. melanogaster stocks were maintained on 
conventional cornmeal-agar-molasses medium in a 12-h-light, 12-h-dark cycle at 
25 °C. Details of molecular biology manipulations, all primers and fly strains are in 
Methods. 

Single-sensillum extracellular recordings. Recordings of female fly antennae
were performed as described previously and are detailed in Methods. The
respective amounts of 1-octen-3-ol emitted from the stimulus pipettes with and without
DEET were investigated through SPME and linked GC-MS analysis as detailed in
Methods. 


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Genomic DNA. DNA was prepared according to the Quick Fly Genomic DNA
Prep protocol from the Berkeley Drosophila Genome Project
(http://www.fruitfly.org/about/methods/inverse.pcr.html). DNA (1.5 µl) was used for amplification
using the KOD PCR Kit (Novagen). For Or59b, primers were designed to anneal
to the 5' and 3' untranslated regions of the w1118 Or59b locus:
5'-gaattcTCCGGGTATAAAGTGCAGGTGCTGGCACCG-3' (forward);
5'-ctcgagGCTCTTTTTTGCGGGGGCTCATGGGTGCAG-3' (reverse).

Orco was amplified using primers that amplify the complete coding region:
5'-gaattcATGACAACCTCGATGCAG-3' (forward);
5'-caattgCTTGAGCTGCACCAGCACCA-3' (reverse).

PCR products were cloned into pGEM-T Easy (Promega Corporation),
sequenced (GENEWIZ, Inc.) and analysed using SeqMan software (DNASTAR, 
Inc.). For each strain, at least four independent samples were analysed, derived 
from at least two different genomic preparations and two different PCR reactions. 
These were sequenced and compared to NCBI reference sequences for each gene 
(Or59b: NM_079098.1; Orco: NM_079511.4). 

Complementary DNA preparation and transgenic flies. Total RNA was
extracted from w1118 and Boa Esperança antennae using the RNeasy Mini Kit
(QIAGEN).

Complementary DNA (cDNA) synthesis was performed according to the
SuperScript III First-Strand Synthesis System for RT-PCR (Invitrogen) using
oligo(dT) primers. Or59b cDNA from both w1118 and Boa Esperança was
amplified using these gene-specific primers:
5'-gaattcATGGCGGTGTTCAAGCTAATCAAACCG-3' (forward);
5'-ctcgagTTACTGGAACTGCTCGGCCAGATTCA-3' (reverse).

PCR products representing full-length w1118 Or59b^{NCBI REF} and Or59b^BE
cDNAs were cloned into pGEM-T Easy, completely sequenced and subcloned
into the pUAST attB vector using EcoRI and XhoI restriction sites.

Single point mutations were introduced into the w1118 Or59b^{NCBI REF} cDNA by
directed PCR mutagenesis. Two independent reactions were prepared: one
contained the forward primer with the desired mutation and the reverse SP6 vector
primer (5'-ATTTAGGTGACACTATAG-3'). The second contained the reverse
mutating primer and the forward T7 vector primer
(5'-TAATACGACTCACTATAGGG-3'). PCR products from the reactions were purified and 1 µl of each
was used as a template and mixed in a second round of amplification with T7 and
SP6 primers to obtain the full gene. For each mutagenesis, the final PCR product
was purified and subcloned in pGEM-T Easy, and the complete Or59b cDNA
carrying the induced mutations was sequenced for verification and compared with
the Or59b^{NCBI REF} sequence.

The double mutants Or59b^{V41F V91A} and Or59b^{T376S V388A} were generated using
Or59b^V41F or Or59b^T376S as a template and a second round of mutagenesis was
implemented with the corresponding primers.

The following primers were used.
Or59b^V41F: 5'-CCGCCGAAGGAGGGATTCCTGCGCTACGTGT-3' (forward);
5'-ACACGTAGCGCAGGAATCCCTCCTTCGGCGG-3' (reverse).
Or59b^V91A: 5'-AGGTGTGCATCAATGCGTATGGCGCCTCGG-3' (forward);
5'-CCGAGGCGCCATACGCATTGATGCACACCT-3' (reverse).
Or59b^T376S: 5'-TGAACAGCAACATAAGCGTGGCCAAGTTCGC-3' (forward);
5'-GCGAACTTGGCCACGCTTATGTTGCTGTTCA-3' (reverse).
Or59b^V388A: 5'-GCATCATTACAATAGCGCGACAAATGAATCT-3' (forward);
5'-AGATTCATTTGTCGCGCTATTGTAATGATGC-3' (reverse).
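
As a quick consistency check on the mutagenic primers listed above, the following sketch tests whether each forward/reverse pair is an exact reverse complement, as overlapping mutagenesis primers usually are. The sequences are copied from the text; the variant labels are assumed from the parenthetical genotype labels given later in the Methods, and the function names are illustrative rather than part of the original protocol.

```python
# Mutagenic primer pairs (5'->3'), copied from the Methods text.
# The dictionary keys are labels assumed from the genotype list.
PRIMER_PAIRS = {
    "V41F": ("CCGCCGAAGGAGGGATTCCTGCGCTACGTGT",
             "ACACGTAGCGCAGGAATCCCTCCTTCGGCGG"),
    "V91A": ("AGGTGTGCATCAATGCGTATGGCGCCTCGG",
             "CCGAGGCGCCATACGCATTGATGCACACCT"),
    "T376S": ("TGAACAGCAACATAAGCGTGGCCAAGTTCGC",
              "GCGAACTTGGCCACGCTTATGTTGCTGTTCA"),
    "V388A": ("GCATCATTACAATAGCGCGACAAATGAATCT",
              "AGATTCATTTGTCGCGCTATTGTAATGATGC"),
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def check_pairs(pairs: dict) -> dict:
    """Map each label to True if the reverse primer is the exact
    reverse complement of the forward primer."""
    return {
        label: reverse_complement(forward) == reverse
        for label, (forward, reverse) in pairs.items()
    }


if __name__ == "__main__":
    for label, ok in check_pairs(PRIMER_PAIRS).items():
        print(label, "complementary" if ok else "MISMATCH")
```

A mismatch reported by such a check would point to a transcription error in one primer of the pair rather than to a problem with the mutagenesis design itself.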

Transgenic animals were generated in the w1118 genetic background (Genetic
Services, Inc.) using the phiC31-based integration system targeted at the attP2
docking site on chromosome II (ref. 32).

Fly stocks. Drosophila melanogaster stocks were maintained on conventional
cornmeal-agar-molasses medium in a 12-h-light, 12-h-dark cycle at 25 °C. The
w1118 strain was used as wild-type control.

The following wild-type strains were used: Akayu [Drosophila Genetic Resource 
Center (DGRC) #103389; origin, Japan]; Algeria (isogenic for II and III chromo- 
somes, DGRC #103390; origin, Algeria); Alma-Ata (DGRC #103391; origin, 
Kazakhstan); Canton-S (isogenic for II and III, lab stock; origin, Ohio, USA); 
CA1 (Bloomington Drosophila Stock Center #3846; origin, Cape Town, South 
Africa); Coffs Harbour (DGRC #103411; origin, New South Wales, Australia); 
Kericho-7B (DGRC #103428; origin, Kericho, Kenya); Manago (isogenic for II 
and III, DGRC #103433; origin, Hawaii, USA); Oregon-R (isogenic for II and III, 
lab stock; origin, Oregon, USA); San Miguel (isogenic for II and III, DGRC
#103450; origin, Buenos Aires, Argentina); WT Berlin (isogenic for II and III, 
Heisenberg laboratory, Würzburg, Germany; origin, Berlin, Germany); Batumi-L
(DGRC #103396; origin, Batumi, Georgia); Boa Esperança (DGRC #103400; origin,
Minas Gerais, Brazil); BOG2 (Bloomington #3842; origin, Bogota, Colombia); CO3 
(Bloomington #3848; origin, Commack, New York, USA); EV (Bloomington 
#3851; origin, Ellenville, New York, USA); Medvast-21 (DGRC #103435; origin, 
Finland); VAG 2 (Bloomington #3876; origin, Athens, Greece).

The following mutant alleles and transgenic flies were used: Or22a/b^Δhalo (ref.
33) and Or22a-Gal4 (ref. 34). The genotypes of the flies used for Fig. 4c and
Supplementary Fig. 8 are as follows: Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b
(labelled Or59b^{NCBI REF} in the figure), Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^V41F
(V41F), Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^V91A (V91A),
Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^{V41F V91A} (V41F V91A),
Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^T376S (T376S),
Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^V388A (V388A),
Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^{T376S V388A} (T376S V388A) and
Or22a/b^Δhalo;Or22a-Gal4/UAS-Or59b^{V41F V91A T376S V388A} (V41F V91A T376S V388A).

SPME quantification of emitted volatiles. The effect of DEET on the amount of 
1-octen-3-ol emitted from the stimulus pipettes was investigated through SPME and 
linked GC-MS analysis. Stimulus pipettes, prepared as per the electrophysiology 
experiments, were loaded either with one filter strip impregnated with 5 µl of
1-octen-3-ol (10⁻²) and with a second strip containing 5 µl of paraffin oil, or with
the second strip impregnated with 5 µl of pure DEET. The pipettes were connected
to a stimulus controller (Syntech CS 55; www.syntech.nl) and volatiles emitted from 
the pipettes during ten puffs, of 2-s duration each, delivered with 1-s intervals, were 
trapped on a SPME fibre (Supelco blue fibre; 57310-U; polydimethylsiloxane/divinylbenzene,
65-µm coating; http://www.sigmaaldrich.com), inserted 2 cm into the
pipette tip. After completion of the stimulus cycle, the SPME fibres were immediately 
retracted and injected into a GC-MS device for quantification. This device (Agilent 
GC6890N fitted with MS5975B unit; http://www.agilent.com) was equipped with a 
HP5-MS column (Agilent Technologies) and operated as follows. The inlet
temperature was set to 250 °C. Desorption time was 1 min. The temperature of the gas
chromatography oven was held at 70 °C for 2 min and then increased by
20 °C min⁻¹ to 280 °C, with the final temperature held for 2 min. For mass
spectroscopy, the transfer line was held at 280 °C, the source at 230 °C and the quad at
150 °C. Mass spectra were taken in EI mode (at 70 eV) in the range from 33 m/z to
350 m/z, with a scanning rate of 4.42 scans per second. GC-MS data were processed
with the MDS-CHEMSTATION software (Agilent Technologies), and peak areas 
were autointegrated. Five replicates were collected for each condition and data were 
plotted as mean ± s.e.m. Statistical significance was assessed using a t-test.
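
The timing figures in the SPME and GC-MS protocol above can be cross-checked with a few lines of arithmetic. The sketch below assumes a simple linear oven ramp of 20 °C per minute and a puff train with pauses only between puffs; the function names are illustrative and not taken from the instrument software.

```python
def oven_program_minutes(start_c: float, initial_hold_min: float,
                         ramp_c_per_min: float, final_c: float,
                         final_hold_min: float) -> float:
    """Total oven time: initial hold + linear ramp + final hold."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def puff_train_seconds(n_puffs: int, puff_s: float, interval_s: float) -> float:
    """Duration of n puffs separated by (n - 1) intervals."""
    return n_puffs * puff_s + (n_puffs - 1) * interval_s


if __name__ == "__main__":
    # 70 C for 2 min, 20 C/min up to 280 C, final hold 2 min.
    print(oven_program_minutes(70, 2, 20, 280, 2), "min oven program")
    # Ten 2-s puffs delivered with 1-s intervals.
    print(puff_train_seconds(10, 2, 1), "s of stimulus delivery")
```

Under these assumptions the oven program runs for 14.5 min per injection and each SPME fibre is exposed to a 29-s stimulus train.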

Electrophysiology and odorants. Female transgenic flies were recorded at 5 d
after adult eclosion. All other flies were recorded at 5-10 d after adult eclosion.
Single-sensillum recordings were performed as described previously. For each
experiment in which we recorded OR59B variants expressed in the ab3A neuron,
we verified that responses of endogenous OR59B in the native ab2A neuron
showed normal inhibition by 10⁻² 1-octen-3-ol (data not shown). Odorants were
obtained from Sigma-Aldrich at high purity and diluted (v/v) in paraffin oil as 
indicated. DEET was obtained from Alfa Aesar and was applied undiluted. 
Chemical Abstracts Service (CAS) numbers are as follows: paraffin oil (8012-95-1);
1-octen-3-ol (3391-86-4); pentanal (110-62-3); pentanoic acid (109-52-4);
2-heptanone (110-43-0); 1-octanol (111-87-5); (-)-linalool (126-91-0); methyl
acetate (79-20-9); 2,3-butanedione (431-03-8); ethyl hexanoate (123-66-0);
butyraldehyde (123-72-8); ethyl-3-hydroxybutyrate (5405-41-4); ethyl acetate (141-78-6);
hexanol (111-27-3); DEET (134-62-3).

The desired odour dilution (30 µl) was pipetted onto a filter paper strip (3 mm
× 50 mm) and 30 µl of undiluted DEET or paraffin oil solvent was pipetted onto a
second filter paper strip. Both filter paper strips were then carefully inserted into a 
glass Pasteur pipette. Before any recordings, charcoal-filtered air was forced 
through the pipette for 1-3 s to remove dead space in the odour delivery system. 
For actual recordings, charcoal-filtered air was continuously applied to the insect 
antenna, with odour delivered through the pipette to the fly antennae for 1 s. Each 
pipette was used at most three times and no more than three sensilla were tested 
per animal. Sensilla types were identified by size, location on the antenna and 
responsiveness to known preferred odorants.

Data were collected using AUTOSPIKE (Syntech) and analysed by custom
spike-sorting algorithms. Responses were initially classified as excitatory or inhibitory by
visual inspection of the responses after odour application. An odour was classified as 
excitatory if it increased the spontaneous firing rate and inhibitory if it decreased the 
spontaneous firing rate. The data were then analysed by subtracting average spon- 
taneous activity (expressed as spikes per second) in the 15 s before odour application 
from activity either in the first 600 ms after odour delivery, for excitatory odorants, 
or in the first 1 s, for inhibitory odorants. This value is referred to as Δ, and will
typically have a negative value for inhibitory odorants and a positive value for 
excitatory odorants. The onset of odour-evoked responses varied owing to slight 
variations in the position of the odour delivery system relative to the sensillum being 
recorded. To correct for this, we calibrated the inferred odour onset on the basis of 
excitatory responses elicited by control stimuli applied at the beginning of each trial 
(ab2, 10° methyl acetate; ab3, 10° 2-heptanone). 
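
The response measure described above can be sketched in a few lines. This is a minimal illustration, assuming spike times in seconds for a single sorted neuron and an already calibrated odour onset; the function and parameter names are hypothetical and this is not the custom analysis code used in the study.

```python
def relative_response(spike_times, odour_onset, excitatory,
                      baseline_s=15.0):
    """Odour-evoked firing rate minus spontaneous rate (spikes per second).

    Baseline: the 15 s before odour onset.
    Response window: first 0.6 s after onset for excitatory odorants,
    first 1 s for inhibitory odorants.
    """
    window_s = 0.6 if excitatory else 1.0
    baseline_count = sum(
        1 for t in spike_times
        if odour_onset - baseline_s <= t < odour_onset
    )
    response_count = sum(
        1 for t in spike_times
        if odour_onset <= t < odour_onset + window_s
    )
    return response_count / window_s - baseline_count / baseline_s


if __name__ == "__main__":
    # Synthetic example: 10 spikes/s spontaneous activity for 15 s,
    # followed by a burst of 30 spikes within 0.6 s of odour onset.
    spontaneous = [k / 10 + 0.05 for k in range(150)]
    burst = [15.0 + k * 0.02 for k in range(30)]
    print(relative_response(spontaneous + burst, 15.0, excitatory=True))
    # Complete silencing during the 1-s window gives a negative value.
    print(relative_response(spontaneous, 15.0, excitatory=False))
```

With this definition, an excitatory odour yields a positive value and an inhibitory odour a negative one, bounded below by the negative of the spontaneous rate.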

Statistical analysis. Dose-response curves were fitted with ORIGINPRO 8 
(OriginLab) using a logistic function, except for responses to 1-octen-3-ol in 
Fig. 1d, which used a biphasic function. 
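
The exact equations behind the logistic and biphasic fits are not given in the text. The sketch below therefore assumes a common four-parameter logistic form and, for the biphasic case, a sum of one inhibitory and one excitatory logistic term on top of a baseline. All parameter names and values are illustrative assumptions, not fitted values from this study.

```python
def logistic(x, a1, a2, x0, p):
    """Four-parameter logistic curve (assumed form).

    a1: response as x approaches 0; a2: response at saturating x;
    x0: concentration giving the half-maximal change; p: slope factor.
    """
    return a2 + (a1 - a2) / (1.0 + (x / x0) ** p)


def biphasic(x, baseline, inhib_amp, x0_inhib, p_inhib,
             excit_amp, x0_excit, p_excit):
    """Hypothetical biphasic curve: baseline plus a high-affinity
    inhibitory term (negative amplitude) and a low-affinity
    excitatory term (positive amplitude)."""
    inhibition = logistic(x, 0.0, inhib_amp, x0_inhib, p_inhib)
    excitation = logistic(x, 0.0, excit_amp, x0_excit, p_excit)
    return baseline + inhibition + excitation


if __name__ == "__main__":
    # Illustrative parameters only: inhibition saturates at low
    # concentration, activation takes over at high concentration.
    params = dict(baseline=0.0, inhib_amp=-20.0, x0_inhib=1e-5, p_inhib=1.0,
                  excit_amp=80.0, x0_excit=1e-2, p_excit=1.0)
    for conc in (1e-6, 1e-4, 1e-2, 1.0):
        print(conc, round(biphasic(conc, **params), 2))
```

A curve of this shape is negative at low concentrations and positive at high ones, which is the qualitative pattern a biphasic fit is meant to capture; fitting it to data would additionally require a least-squares routine.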
Comparisons of paired dose-response curves in Figs 1 and 3 and Supplementary
Figs 1, 2 and 4 used an F-test to assess the statistical significance of
differences between the two curve fits. A two-tailed t-test was performed for all 
comparisons in Fig. li (non-paired), Figs 2-4 and Supplementary Figs 3, 4 and 7 
(paired). Type I errors were addressed by using a Bonferroni correction for mul- 
tiple comparisons applied to each set of experiments. Data in Supplementary Fig. 6 
were fitted using a linear regression analysis. 
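
The Bonferroni correction mentioned above can be illustrated as follows. The sketch multiplies each raw P value by the number of comparisons in the set and caps the result at 1; the significance threshold of 0.05 is an assumption for illustration, since the text does not state the alpha level used.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each raw P value.

    adjusted_p = min(1, p * m), where m is the number of comparisons
    in the set. alpha is an assumed threshold, not taken from the paper.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted < alpha))
    return results


if __name__ == "__main__":
    # Three hypothetical comparisons within one set of experiments.
    for adjusted, significant in bonferroni([0.01, 0.04, 0.5]):
        print(round(adjusted, 3), significant)
```

The correction is conservative: a raw P value of 0.04 no longer passes a 0.05 threshold once three comparisons are taken into account.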

The OR59B snake plots in Fig. 4 and Supplementary Fig. 7 were
hand-composed on the basis of transmembrane domain predictions generated with
the PredictProtein algorithm.

The crystal structure of dynamin


Marijn G. J. Ford, Simon Jenni & Jodi Nunnari


Dynamin-related proteins (DRPs) are multi-domain GTPases that function via oligomerization and GTP-dependent
conformational changes to play central roles in regulating membrane structure across phylogenetic kingdoms. How
DRPs harness self-assembly and GTP-dependent conformational changes to remodel membranes is not understood.
Here we present the crystal structure of an assembly-deficient mammalian endocytic DRP, dynamin 1, lacking the
proline-rich domain, in its nucleotide-free state. The dynamin 1 monomer is an extended structure with the GTPase
domain and bundle signalling element positioned on top of a long helical stalk with the pleckstrin homology domain
flexibly attached on its opposing end. Dynamin 1 dimer and higher order dimer multimers form via interfaces located in
the stalk. Analysis of these interfaces provides insight into DRP family member specificity and regulation and provides a
framework for understanding the biogenesis of higher order DRP structures and the mechanism of DRP-mediated
membrane scission events.


Dynamin-related proteins (DRPs) belong to a highly conserved
(Supplementary Fig. 1) GTPase superfamily that catalyses diverse
membrane remodelling events. Membrane scission DRPs include dynamin
1, which catalyses clathrin-coated vesicle scission at the plasma
membrane, and Drp1/Dnm1, which divide mitochondria. Despite their
functional diversity, all DRPs undergo GTP cycle-dependent
conformational changes to regulate self-assembly and disassembly.
DRP architecture includes an amino-terminal GTPase domain, a
bundle signalling element (BSE), a middle domain (MD) and a
GTPase effector domain (GED). Many DRPs also have a variable
region between the MD and GED; in dynamin 1, this is a pleckstrin
homology domain (PH) that binds to phosphatidylinositol-4,5-bisphosphate
(PtdIns-4,5-P2)-containing membranes to facilitate targeting
and possibly membrane remodelling via membrane insertion.
Assembled DRPs can form helical structures in vitro. Within these
structures, GTP cycle-driven conformational changes result in
membrane remodelling. The structural basis for DRP self-assembly and
GTP cycle-dependent conformational changes is not fully understood.
The mechanism of assembly has been informed by the structure of the
‘stalk’ of MxA, a distantly related DRP, in which the MD and part of the
GED form an extended helical bundle that mediates self-assembly via
conserved interfaces. Several structures have also provided insight into
DRP GTP cycle conformational changes (Supplementary Fig. 2).
Cryo-electron microscopic structures of assembled dynamin in
guanosine-5'-[(β,γ)-methyleno]triphosphate (GMPPCP)-bound and
nucleotide-free states have provided models for the assembled oligomers and
the location of the GTPase and PH domains within the helical
structures. However, to understand the basis of DRP self-assembly and
mechanism, the architecture of DRP domains within a single molecule
must be elucidated. Here, we report the crystal structure of an
assembly-deficient dynamin 1 in the nucleotide-free state that lacks only the
unstructured carboxy-terminal proline-rich domain (PRD).


G397D, an assembly-deficient dynamin 1 mutant 


The propensity of DRPs to assemble has been an obstacle to obtaining crystals suitable for
diffraction experiments. We identified the mutation G436D, at an invariant MD residue, in the
Saccharomyces cerevisiae mitochondrial division DRP Dnm1, by screening for mutations that
possessed the same phenotype as assembly-deficient Dnm1 G385D, specifically mutations that
shifted a Dnm1-green fluorescent protein (GFP) fusion from a punctate to a more diffuse
localization pattern in yeast cells. We expressed and purified (Supplementary Fig. 3a) the
orthologous rat dynamin 1 mutant protein lacking the PRD (Dyn1 G397D ΔPRD) and examined its
ability to self-assemble using light scattering (Fig. 1a). Addition of GMPPCP caused an
increase in scattering in Dyn1 ΔPRD samples, but caused no change in scattering in Dyn1
G397D ΔPRD samples. Using a combination of size-exclusion chromatography and sucrose gradient
centrifugation, Dyn1 ΔPRD and Dyn1 G397D ΔPRD were estimated to be dimeric (Supplementary
Fig. 3b and Supplementary Table 1) under non-assembly conditions, similar to other
assembly-deficient DRP mutants. Under assembly conditions, we observed an increase in the
sedimentation coefficient of Dyn1 ΔPRD, but not of Dyn1 G397D ΔPRD (Supplementary Table 1).
Dyn1 G397D ΔPRD was defective for assembly-stimulated GTP hydrolysis (Fig. 1b and
Supplementary Table 2) and failed to assemble into helical structures on phosphatidylinositol
4-phosphate (PtdIns-4P)-containing lipid nanotubes, in contrast to Dyn1 ΔPRD (Fig. 1c).
Dyn1 G397D ΔPRD was also soluble at higher concentrations than Dyn1 ΔPRD. Together, these data
indicate that the G397D mutation severely hampers dynamin self-assembly and substantiate the
critical role of the MD in intermolecular interactions. Given these characteristics,
Dyn1 G397D ΔPRD presented an attractive target for crystallization.


Crystallization and structure of Dyn1 G397D ΔPRD


We obtained orthorhombic crystals of Dyn1 G397D ΔPRD that diffracted to a minimum Bragg
spacing of 3.1 Å (Methods and Supplementary Table 3). The structure was solved by molecular
replacement, using the nucleotide-free rat dynamin 1 GTPase domain, the human dynamin 1 PH
and a portion of the human MxA stalk as sequential search models (Methods). We traced
the complete model, except for some disordered loops, and assigned amino acids after
refinement of the molecular replacement solution. The model was refined to R/R_free values
of 0.21/0.27. A representative example of the B-sharpened likelihood-weighted
2mF_obs − DF_calc electron density map is shown in Fig. 2c.
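
For readers less familiar with crystallographic statistics, the R values quoted above follow the standard residual definition, sketched here. F_obs and F_calc denote observed and model-calculated structure-factor amplitudes; the sets W (working reflections) and T (test reflections withheld from refinement) are notation introduced for this sketch, and the scale factor between observed and calculated amplitudes is omitted for brevity.

```latex
R = \frac{\sum_{hkl \in W} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl \in W} |F_{\mathrm{obs}}(hkl)|},
\qquad
R_{\mathrm{free}} = \frac{\sum_{hkl \in T} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
                         {\sum_{hkl \in T} |F_{\mathrm{obs}}(hkl)|}
```

Because the reflections in T never enter refinement, R_free is expected to be somewhat higher than R, as it is for the values reported here (0.27 versus 0.21).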

Dyn1 G397D ΔPRD forms an extended structure with the GTPase
and PH domains separated by a stalk consisting of an anti-parallel
helical bundle of the MD and a helix from the GED (Fig. 2a, b). By

Figure 1 | The G397D mutation in Dyn1 ΔPRD blocks self-assembly. a, 90° light scattering.
Dyn1 ΔPRD (blue, 1 μM) and Dyn1 G397D ΔPRD (red, 1 μM) were monitored after addition of
0.5 mM GMPPCP and, in the case of Dyn1 ΔPRD, 1 mM GTP (blue arrows). Red arrow indicates
opening of sample port. AU, arbitrary units. b, Steady-state GTP hydrolysis kinetics of
Dyn1 ΔPRD and Dyn1 G397D ΔPRD in the absence (light blue and red) and presence (dark blue
and orange) of 0.1 mg ml−1 liposomes containing 10% PtdIns-4P, monitored by a
NADH-dependent coupled assay as described. A representative trace is shown with 1 mM GTP.
c, Transmission electron microscopy of negative-stained 0.25 mg ml−1 10% PtdIns-4P lipid
nanotubes with Dyn1 ΔPRD (middle) and Dyn1 G397D ΔPRD (right) and no added protein (left)
and 0.5 mM GMPPCP. Scale bars, 200 nm.


contrast, the distantly related bacterial dynamin-like protein (BDLP) is compactly folded in
its nucleotide-free and GDP-bound states (Supplementary Fig. 2a), with its ‘paddle’, in an
analogous region to the PH domain of dynamin, in close proximity to the GTPase domain. Thus,
unlike in BDLP, GTP binding in dynamin is not harnessed to form an extended structure,
consistent with nucleotide-independent assembly of dynamin on liposomes. Linkers connecting
the MD to the PH and the PH to the GED are disordered, indicating a flexibly tethered PH.
Thus, any of three crystallographic symmetry-related PH domains could connect to the
remainder of the structure (Fig. 2d). The probable PH partner, based on the fit with the
envelope of an assembly-deficient dynamin dimer determined by small-angle X-ray scattering
(SAXS), is shown in Fig. 2b. The structure of the PH is similar to those previously
determined, with expected differences concentrated in the variable loops.

The structure of the GTPase domain is similar to that of the previously determined
nucleotide-free dynamin GTPase domain. There are minor expected changes in the poorly
resolved switch 2 region, the dynamin-specific loop and in the loop connecting the N_GTPase
to the GTPase domain. The C_GTPase helix is kinked at the conserved proline 294 and,
together with the N_GTPase helix and a helix at the C terminus of the GED (C_GED), forms
the three-helix BSE. C_GED covers a groove of 937 Å² between the N_GTPase and C_GTPase
helices. In the nucleotide-free dynamin 1 GTPase and the GDP-bound Dictyostelium discoideum
dynamin A GTPase structures, a myosin peptide substitutes for C_GED, indicating the
importance of this interface for GTPase domain stability.

A linker with elevated crystallographic temperature factors connects the C_GTPase to the
dynamin stalk (Fig. 2a, b) and contains two prolines (319 and 322), whose equivalents in the
distantly related DRP atlastin 1 connect the GTPase and MD (Supplementary Fig. 2b).

Figure 2 | The crystal structure of Dyn1 G397D ΔPRD. a, Schematic of the Dyn1 domain
structure. Numbers indicate domain-ending amino acid. Colour scheme used here is retained.
b, Crystal structure of Dyn1 G397D ΔPRD. Linkers between the N_GTPase and GTPase domain and
the C_GTPase and MD are shown in grey. Loops with no density are represented with dashed
lines. Stalk nomenclature is based on that of the MxA stalk structure. VL, variable loop.
c, An example of the refined, B-sharpened 2mF_obs − DF_calc map, contoured at 1σ. The region
shown is part of the stalk boxed by purple dotted lines in b. d, Three symmetry-related PH
domains in the lattice.


Although the dynamin stalk sequence shares limited identity with MxA, its overall structure
is similar and the MxA nomenclature is retained (Supplementary Fig. 4a). As for MxA, helix α1
is split by a disordered loop, L1. The remainder of α1 diverges from MxA and is split into
two helices, termed α1^C1 and α1^C2. Helix α1^C2 connects to helix α2 via a short disordered
loop, L2. Helices α2 and α3 run the length of the stalk and are joined by a short loop, L3.
The stalk is completed by GED-derived helix α4 that spans the stalk and connects to C_GED
via a linker. Following helix α3 is a coil that folds across α4 and is strongly conserved in
dynamins, Drp1s and Dnm1, but absent in MxA.


The dynamin-dimer interface 
The crystal lattice contains linear filaments of dynamin assembled via three stalk
interfaces, similar to MxA, resulting in layers of interacting stalks separated by GTPase and
PH domains (Fig. 3a, b). Interface 2, the largest with a buried surface of 1,339 Å², has
two-fold symmetry and is formed by residues from stalk helices α4 and α3 (Fig. 3c), with
an additional residue from α1^C1 (H367). Each protomer in the interface contributes seven
direct hydrogen bonds, and eight hydrophobic residues line the site of contact between α3
and α4 (Fig. 3c).
Interface 2 sequence conservation indicates a mechanism for dimer specificity within the DRP
superfamily. Phylogenetic analysis of the hydrogen bonding partners within this region in
dynamins, Drp1s and Dnm1, and Mx proteins allowed us to categorize three classes of
residues: conserved in dynamins, Drp1/Dnm1, and Mx proteins (in

Figure 3 | Dyn1 G397D ΔPRD stalk interfaces mediate self-assembly. a, Schematic diagram of
four monomers showing interfaces in the crystal lattice. b, Surface representation showing
the locations of interfaces 1 and 2. c, Detail of interface 2. Protomers are shown in lighter
and darker hues. Green dotted lines are hydrogen bonds. d, Conservation of interface 2
residues involved in hydrogen bonding in dynamins, Drp1/Dnm1 and Mx proteins. Blue, conserved
in dynamins, Drp1/Dnm1 and Mx proteins; yellow, conserved in dynamins and


blue), conserved only within dynamins and Drp1/Dnm1 (in yellow), and conserved only in the
dynamins (in red) (Fig. 3d). Plotting these classes onto a surface representation of the
dynamin monomer (Fig. 3e) indicates that dimerization specificity is controlled by a
spatial combinatorial code. As predicted from this code, hetero-oligomers consisting of
dynamin 1 and dynamin 2 are observed. Several residues contributing to the apparent
specificity are localized in the strongly conserved coil of dynamin, Drp1 and Dnm1 that
follows α3.


Dynamin stalk interfaces drive self assembly 

Interfaces 1 and 3 are also stalk-localized and mediate higher order assembly of dynamin
dimers (Fig. 3a). Interface 1 is at the tips of interacting stalks, proximal to the GTPase
domain and BSE (Fig. 3f), and is formed through interactions between protomer helices α4
and α1. The interface is capped by a stacking interaction between

Drps; red, conserved in dynamins. Alignment shows a subset of sequences used to determine the
conservation. Sequences are identical to Supplementary Fig. 1, with the addition of Rattus
norvegicus Mx1 (P18588.1) and Homo sapiens Mx1 (P20591.4). Fruitfly, Drosophila melanogaster;
human, Homo sapiens; rat, Rattus norvegicus; worm, Caenorhabditis elegans; yeast,
Saccharomyces cerevisiae. e, Surface representation of conservation data shown in d.
f, Interface 1. Density interpreted as a PEG400 molecule is shown in black.


opposing and flexible Y706 residues, conserved in dynamins and Drp1s. Interface 1 includes
four hydrogen bonds (Fig. 3f, dashed green lines), with the remainder consisting of
hydrophobic residues. The buried surface area of interface 1 is relatively small (624 Å²)
and is likely to be dynamic to tolerate protomer interactions in a range of orientations.

Interface 3 is at the distal end of the stalk, where L2 loops from symmetry-related molecules
are in close proximity (Supplementary Fig. 4b). The G397D mutation is within L2, which could
not be traced owing to poor density. The N-terminal region of L2 is also in close proximity
to a symmetry-related L1, which features a highly conserved glycine, G346, that when mutated
in Dnm1 (G385D) blocks assembly beyond the dimer. We therefore predict that G397 and G346 are
near one another in assembled dynamin. In addition, mutations at two dynamin arginine
residues, R399 (in L2) and R361 (in α1^C1, following L1), and the corresponding interface 3
mutations in MxA also inhibit assembly and stabilize dimeric forms. In contrast, disruption
of MxA interface 1 yields a mixture of dimers and tetramers. Thus, our data indicate that
interface 3 is critical at an early step in the dynamin assembly pathway as its disruption
stabilizes dimeric dynamin and allows linear filament formation. In contrast, interface 1
interactions probably function only to stabilize oligomerization.

From our structure, we propose that the dynamin dimer interface 2 is constitutive and
relatively rigid. Compared with the linear arrangement in the crystal, we propose that in
helical assemblies the necessary rotational and translational shifts occur between adjacent
dimers at interfaces 1 and 3 (Fig. 4a). Indeed, stalk dimers fitted into the GMPPCP-bound
helical electron microscopic reconstruction of dynamin (Fig. 4b) possess a more tightly
packed interface 3 and interface 1. These differences can be attributed to the disordered
interface 3 in the crystal, which is probably due to steric hindrance from the G397D
mutation, and to the dynamic nature of the hydrophobic interface 1.


Regulation of DRPs 


The dynamin PH is essential for endocytosis and interacts with inositol phospholipids with
low affinity. Centronuclear myopathy (CNM) disease mutations cluster at the C terminus of the
PH α-helix (Supplementary Fig. 5a), underscoring its importance. They cause an increase in
basal GTPase activity, without altering interactions with inositol phospholipids. In
addition, SAXS analysis of an assembly-deficient dynamin indicates that the CNM mutants have
a different conformation compared to wild type. In our lattice, three PH domains related by
crystallographic symmetry lie close to interface 3 and the L1 loop (Fig. 2d), suggesting an
interaction. Thus, the PH may serve to regulate access to this key multimerization interface
to couple dynamin membrane interactions to dynamin assembly. Phosphoinositide binding by the
PH variable loops and penetration of the membrane by variable loop 1 could help to expose
dynamin interface 3 and/or L1 and thus promote multimerization.

Alignment of the dynamin PH and corresponding sequences from Drp1s shows conservation of key
residues (Supplementary Figs 1 and 5b), indicating that lipids may similarly regulate
mammalian mitochondrial division. However, mammalian Drp1s lack most of the C-terminal PH
α-helix, including residues mutated in CNM. The details of Drp1 interface 3 regulation may
therefore be different. Consistently, Drp1 splice variants have deletions in this region,
pointing to a potential regulatory role. By contrast, Danio rerio Drp1 and fungal Dnm1 have
an insert B of unknown structure in this region. Mitochondrial division DRPs can therefore
be subdivided on this basis, which correlates with the divergence of their corresponding
effectors, yeast Mdv1 and mammalian Mff.


Discussion 


Our structure provides insight into the mechanism and regulation of dynamin assembly and
into how the dynamin GTPase cycle is harnessed for function. Several observations point to a
key role of the dynamin BSE in the formation of the GTPase-GTPase dimer interface observed in
the GTPase-C_GED GDP·AlF4− crystal structure. This interface is likely to form from adjacent
rungs of a dynamin helix and is critical to dynamin function as it mediates
assembly-stimulated GTP hydrolysis. Comparison of the GTPase-GTPase dimer with our
nucleotide-free structure indicates that in addition to expected differences, the BSE is
flexible. Dynamin genetic data also support a role for the BSE in the regulation of
GTPase-GTPase interface formation. Specifically, the dynamin switch 2 shibire ts2 mutation,
G146S, which causes endocytic intermediates with ‘collared’ dynamin necks to accumulate, is
suppressed by the sushi mutation A738T, located in the C_GED peptide, facing the hydrophobic
groove (Fig. 5a). Insight into how dynamin undergoes conformational changes also comes from
distantly related DRP structures (Supplementary Fig. 2). GTP-dependent GTPase domain
dimerization is also observed for guanylate binding protein, indicating that this may be a
common feature of the DRP superfamily (Supplementary Fig. 2c). In addition, two recent
structures of the atlastin 1 cytosolic domain, thought to represent pre- and
post-endoplasmic reticulum membrane fusion conformations, indicate that large changes occur
in the position of the 3-helix bundle ‘middle domain’ relative to the GTPase domain
(Supplementary Fig. 2b).

We propose that GTP binding and self-assembly promote dynamin 
GTPase-GTPase dimer formation via an opening of the BSE relative 
to the GTPase domain. In support, a modified version of our structure 
can be fit into the cryo-electron microscopic reconstruction of 
GMPPCP-bound dynamin with the BSE in a substantially more open 

Figure 4 | Oligomerization of dynamin into helical structures. a, Dynamin helices derived
from the linear arrangement in our crystal structure. Two stalk dimers (green and magenta)
that engage in interface 1 and 3 are related by crystallographic translation. Experimentally
determined helical parameters for dynamin assembled into helices in the GMPPCP-bound state
were matched by applying a small shift and tilt of one stalk dimer with respect to the other.
b, Placement of oligomerized dynamin model into the electron microscopy density map contoured
at 1.2σ. In side view: the fit of the GTPase domain as a GTPase-GTPase dimer with the BSE in
open conformation to connect to interface 1 of the stalk helix (solid density is contoured at
3.6σ). c, Observed conformational flexibility of the BSE. Model fitted into the helical
reconstruction is shown as black superimposed ribbon on the crystal structure of the
GTPase-C_GED fusion dimer (PDB accession no. 2X2E).

Figure 5 | Model for dynamin GTP cycle conformational changes. a, Mapping of dynamin shibire
and sushi mutations. b, Nucleotide-dependent dynamin conformations. The GTPase core domains
(red) are in the same orientation. Left, GTP-bound state with open BSE conformation of
dynamin as fitted into the GMPPCP-bound electron microscopic reconstruction shown in Fig. 4.
Right, transition state of dynamin obtained by superposition of the BSE residues 291-312 and
727-743 of our structure on the corresponding residues of the GDP·AlF4−-bound GTPase-C_GED
fusion dimer (PDB accession no. 2X2E). Transition from open to closed BSE conformation
results in movement of stalk domains. c, Model for Dyn1 ΔPRD GTP-bound helix. The BSE is
opened to allow GTPase-GTPase dimer formation. d, GTP hydrolysis closes the BSE, which adopts
the conformation of the GDP·AlF4−-bound transition state. This results in a substantial
global constriction of the helical oligomeric assembly causing membrane deformation and
scission. e, Schematic of how the proposed GTP hydrolysis-triggered BSE conformational change
is transmitted to oligomerized stalk domains.


conformation (Fig. 4b). We obtained a good fit of the GTPase-GTPase dimer domains and stalk
interface 1 using constraints dictated by the helical fit of the oligomeric stalk and the
strong GTPase domain-derived density (Fig. 4b). The BSE α-helices fit into the density
stretching from the GTPase domain to the oligomeric stalk, indicating that a BSE flipping-out
motion occurs at two hinge regions: the α-helical kink at P294 and in the loop connecting the
N_GTPase and the GTPase domain, at P32 (Fig. 4c). Consistently, superposition of GDP·AlF4−
GTPase-C_GED fusion GTPase-GTPase dimer partners indicates a smaller yet directionally
equivalent opening of the BSE in one protomer. This proposed conformational change is
feasible as there are relatively few contacts that hold the N_GTPase helix to the body of the
GTPase domain in our structure. In addition, the BSE links the GTPase domain to the stalk and
interface 1 via a short turn, making it well placed to transmit conformational alterations.
As previously noted, comparison of the nucleotide-free and GMPPCP-bound electron microscopic
reconstructions indicates that the GMPPCP-bound helix has a relatively smaller diameter. In
addition, the stalk density is more ‘kinked’ in the GMPPCP-bound form. When we docked our
structure, we observed that the bases of the stalks do not fit the density of the
GMPPCP-bound helix. However, where the fit becomes poor there are strongly conserved prolines
in α2 and α4 and a partially conserved proline in α1^C2 clustered in the stalk. We propose
that these proline residues facilitate the formation of a kink at the stalk base in the
GMPPCP-bound helix, which might allow interface 1 to form more fully in the assembled helix.

As predicted by our model, GTP hydrolysis would induce closure of the BSE via a transition
state represented by the structure of the GTPase domain-BSE fusion in the presence of
GDP·AlF4−, towards the conformation observed in our nucleotide-free structure (Fig. 5b).
As GTP binding to dynamin is not rate limiting and GTP hydrolysis is stimulated by the
formation of the GTPase-GTPase interface between adjacent rungs, it is likely that the
BSE-dependent conformational change occurs in the context of short dynamin helical assemblies
rather than within a helix consisting of many rungs. Short assemblies could also result in an
approximately temporally coordinated conformational change (Fig. 5c-e). The conformational
changes we propose would cause disruption to the assembled helix and the underlying membrane
via local rung shifts. Interface 1 in our structure, which buries a relatively small surface
area in the assembled stalk lattice, will be especially susceptible to change by inter-rung
GTPase-GTPase dimer formation and its subsequent disruption by GTP hydrolysis, and these
changes will be transmitted to interface 3 and the PH. The combined effects of curvature
stress imposed by a short dynamin helical assembly coupled with PH insertion into the
membrane are likely to destabilize the membrane and result in membrane fission. Given the
strong similarities between dynamins and other DRP family members, the structure of dynamin
and our proposed model will serve to guide studies on the mechanisms of action of DRPs in
diverse cellular functions.


METHODS SUMMARY 


The conserved Dyn1 ΔPRD assembly-defective mutant G397D was identified by a cytological assay
using the yeast mitochondrial fission DRP Dnm1-GFP. Dyn1 ΔPRD and Dyn1 G397D ΔPRD were
expressed in E. coli and were purified as described in Methods. Light scattering, sucrose
density gradient centrifugation, mass determination, continuous GTPase assays and electron
microscopy were performed as described in Methods. Crystals were grown by microbatch from
3.2 μl droplets containing 52.5 mM Tris/Cl pH 7.7, 175 mM NaCl, 32.5 mM NaNO3, 20% v/v
PEG 400, 0.97 mM β-mercaptoethanol and 31.9 μM Dyn1 G397D ΔPRD and cryo-protected with
Paratone-N. Reflection data were collected at 100 K at Beamline 8.3.1 at the Advanced Light
Source (Berkeley, California, USA) at a wavelength of 0.9488 Å. Data collection and
processing are described in Methods. The structure was determined by molecular replacement,
using known structures of the nucleotide-free rat dynamin 1 GTPase domain (PDB accession no.
2AKA chain B), the human dynamin 1 PH domain (PDB accession no. 2DYN chain B) and a truncated
form of the human MxA stalk (PDB accession no. 3LJB chain B) as sequential search models.
Structure refinement is described in Methods. The stalk interface 2 dimer was fit into the
previously described cryo-electron microscopic reconstruction of GMPPCP-bound dynamin by
applying tilt and twist to the sequential dimers from the linear filaments observed in the
crystal to match the helical parameters described for the reconstruction. The previously
described GTPase domain dimer formed in the presence of GDP·AlF4− was subsequently fit into
the density. Connection of the GTPase domain to the stalk required a conformational
rearrangement of the BSE, which was independently fit into the visible density. The fit was
subjected to rigid body refinement as described in Methods.


Detection of prokaryotic mRNA signifies microbial viability and promotes immunity


Leif E. Sander, Michael J. Davis, Mark V. Boekschoten, Derk Amsen, Christopher C. Dascher,
Bernard Ryffel, Joel A. Swanson, Michael Müller & J. Magarian Blander

Nature 474, 385-389 (2011). 


In Fig. 1d of this Letter, the labels HKEC and EC were swapped in the 
print version. The lane labelled HKEC should be labelled EC and the 
lane labelled EC should be labelled HKEC. The error has been 
corrected online in the HTML and PDF versions. 


©2011 Macmillan Publishers Limited. All rights reserved 


LETTER 


doi:10.1038/nature10555 


Spontaneous epigenetic variation in the Arabidopsis thaliana methylome


Claude Becker, Jörg Hagmann, Jonas Müller, Daniel Koenig, Oliver Stegle, Karsten Borgwardt &
Detlef Weigel


Heritable epigenetic polymorphisms, such as differential cytosine methylation, can underlie
phenotypic variation. Moreover, wild strains of the plant Arabidopsis thaliana differ in many
epialleles, and these can influence the expression of nearby genes. However, to understand
their role in evolution, it is imperative to ascertain the emergence rate and stability of
epialleles, including those that are not due to structural variation. We have compared
genome-wide DNA methylation among 10 A. thaliana lines, derived 30 generations ago from a
common ancestor. Epimutations at individual positions were easily detected, and close to
30,000 cytosines in each strain were differentially methylated. In contrast, larger regions
of contiguous methylation were much more stable, and the frequency of changes was in the same
low range as that of DNA mutations. Like individual positions, the same regions were often
affected by differential methylation in independent lines, with evidence for recurrent cycles
of forward and reverse mutations. Transposable elements and short interfering RNAs have been
causally linked to DNA methylation. In agreement, differentially methylated sites were
farther from transposable elements and showed less association with short interfering RNA
expression than invariant positions. The biased distribution and frequent reversion of
epimutations have important implications for the potential contribution of
sequence-independent epialleles to plant evolution.

Although there is no doubt that DNA sequence mutations are the primary raw material for
evolutionary change, local DNA methylation variants with major effects on the expression of
nearby genes can be inherited over many generations. However, such epialleles are not always
as stable as the primary DNA sequence. New sequencing technologies have recently enabled the
direct determination of spontaneous DNA mutation rates, and we have previously reported that
A. thaliana experiences about one single-base-pair mutation per haploid genome and
generation. This analysis was based on a set of five mutation accumulation lines that had
been derived from a single individual of the inbred strain used to produce the high-quality
reference genome sequence for A. thaliana. These lines had been separately propagated in a
common environment by single-seed descent for 30 generations. We examined whole-genome
cytosine methylation in these five lines plus five additional lines of the same population
by Illumina sequencing. We interrogated two siblings each of the 31st generation with an
average strand-specific coverage depth of 20× per individual; changes shared within a line
should predominantly reflect differences that had accumulated by the 30th generation.
Because seeds from the founders were no longer available, we compared the 31st generation
individuals to two independent lines that had been propagated for only three generations
(Supplementary Fig. 1).

Out of all cytosine residues with high-quality sequencing support (see Supplementary
Methods), on average 2.8 million were found to be methylated in each line (Supplementary
Table 1). The higher genome-wide methylation rate in our analysis compared to previous
studies reflects the greater statistical power afforded by increased sequencing depth. We
subsequently evaluated 13.9 million cytosines that had at least threefold coverage in all
individuals, of which 3 million were methylated in at least one strain. Using Fisher’s exact
test, we identified about 186,000 (6.2%) positions with a significant change in methylation
(false discovery rate <0.05) between at least one 31st generation and both 3rd generation
lines. Almost all, 99.6%, of these differentially methylated positions (DMPs) were also
detected with an entropy-based method. Given the limited statistical power for weakly
methylated or poorly covered sites (Supplementary Figs 2-4), our DMP estimate would almost
certainly increase with higher sequencing depth. For further analyses, we considered sites
that agreed between 31st generation siblings (on average, 99.8%) and between the two
strains closest to the founder generation (99.7%).
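
The per-site testing described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the read counts, the site names and the choice of the Benjamini-Hochberg procedure for false discovery rate control are assumptions made for illustration, and the comparison is simplified to a single pair of lines rather than one 31st generation line against both 3rd generation lines.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two lines being compared; columns are methylated and
    unmethylated read counts at one cytosine.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of x methylated reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical counts per site: (methylated, unmethylated) reads in a
# 31st generation line, then the same for a 3rd generation line.
sites = {
    "site_1": (19, 1, 2, 18),
    "site_2": (10, 10, 11, 9),
    "site_3": (0, 20, 1, 19),
}
p_values = [fisher_exact_two_sided(*counts) for counts in sites.values()]
q_values = benjamini_hochberg(p_values)
dmps = [name for name, q in zip(sites, q_values) if q < 0.05]
print(dmps)
```

With roughly 20 reads per site, as in the hypothetical counts, only large shifts in methylation reach significance, which is in line with the remark that the DMP estimate would probably rise with deeper sequencing.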

CG sites were highly over-represented among DMPs (Fig. 1a). This is unlikely to reflect
greater instability of CG compared to CHG and CHH positions (where H is A, T or C), but
rather higher statistical power in detecting a change at CG sites, which are on average much
more highly methylated (Supplementary Fig. 4). Among CG sites in genic regions, including
those producing non-coding RNAs, relative abundance of DMPs was two- to fourfold higher
compared with non-differentially methylated positions (N-DMPs). The opposite was the case
for CG positions in transposable elements and intergenic regions, with a similar, but less
pronounced, bias for CHG and CHH sites (Fig. 1b). These observations were in agreement with
CG-DMPs being found most often on chromosome arms, which have the highest gene density
(Fig. 1c), even though cytosine methylation near the centromeres is the highest. Gene body
methylation gradually increases towards the 3′ end, before sharply decreasing at the end of
the last exon, although genes 1 kb or less in length were generally only weakly methylated
(Supplementary Fig. 5a). The profiles of DMPs and N-DMPs were similar across individual
genes, exons, introns and transposable elements (Fig. 1d and Supplementary Fig. 5b, c), but
DMPs were less frequent in promoter and downstream regions. Notably, CG-DMPs accounted for
42% of methylated sites in gene bodies, despite all CG-N-DMPs outnumbering CG-DMPs four to
one.

Twenty-four-nucleotide-long small interfering RNAs (siRNAs) are important in maintaining DNA
methylation, and N-DMPs coincided seven times more often than DMPs with sites to which 24-nt
siRNAs mapped. N-DMPs were also on average only half as far from such sites as DMPs
(P < 2.2 × 10^−16) (Fig. 1e and Supplementary Fig. 6a). siRNAs are enriched in and around
transposable elements. In agreement, the average distance to the closest transposable
element was much shorter for N-DMPs outside of transposable elements, compared to DMPs
(P < 2.2 × 10^−16), even when only considering those in the centromere-distant regions of
each chromosome, which contain relatively few transposable elements (Fig. 1e and
Supplementary Fig. 6b, c).
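
The distance comparison above amounts to a nearest-feature lookup on one chromosome. The following Python sketch shows one way to compute it. The coordinates and function names are hypothetical, and features are assumed to be sorted, non-overlapping, inclusive intervals; nothing here is taken from the authors' code.

```python
from bisect import bisect_right


def distance_to_nearest(position, intervals):
    """Distance from a position to the closest interval (0 if inside one).

    `intervals` must be a non-empty list of sorted, non-overlapping
    (start, end) tuples with inclusive coordinates.
    """
    starts = [start for start, _ in intervals]
    # Number of intervals that start at or before the position.
    index = bisect_right(starts, position)
    candidates = []
    if index > 0:
        _, end = intervals[index - 1]
        candidates.append(0 if position <= end else position - end)
    if index < len(intervals):
        candidates.append(intervals[index][0] - position)
    return min(candidates)


def mean_distance(positions, intervals):
    """Average nearest-interval distance over a set of positions."""
    return sum(distance_to_nearest(p, intervals) for p in positions) / len(positions)


# Hypothetical transposable element coordinates and cytosine positions.
transposable_elements = [(100, 200), (500, 600)]
n_dmp_positions = [210, 480, 620]
dmp_positions = [350, 900, 1200]

print(mean_distance(n_dmp_positions, transposable_elements))
print(mean_distance(dmp_positions, transposable_elements))
```

The same function could be applied to siRNA target sites in place of transposable elements; positions lying inside a feature would need to be filtered out beforehand to mirror the restriction to N-DMPs outside of transposable elements.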

A first major insight from our analyses is that transgenerational 
maintenance of CG methylation in transposable elements is apparently 

Figure 1 | Genome-wide distribution of methylation polymorphisms. a, Contribution of CG, CHH
and CHG sites to total and differential cytosine methylation. 32.8% of all CG, 15.7% of CHG
and 4.6% of CHH sites, adding up to 10.8% of all cytosines, showed evidence of methylation.
b, Distribution of DMPs and N-DMPs according to local annotation. CDS, coding sequence;
ncRNA, non-coding RNA; TE, transposable element. c, Distribution of CG-N-DMPs and CG-DMPs
along each chromosome. Data were normalized to the


much more stable than CG methylation of protein-coding genes, con- 
sistent with DNA methylation being more important for controlling 
the activity of transposable elements compared to the latter. This
also agrees with a report that genic methylation is much more variable 
between wild strains of A. thaliana than methylation of transposable 
elements’. 

Hierarchical clustering based on DMPs grouped siblings as well as 
3rd and 31st generation lines together. An arbitrary selection of methylated
positions, which included about 6% DMPs, produced a similar
pattern; however, with N-DMPs only, clusters became much more 
random (Fig. 2a). These observations indicate that our DMPs capture 
most of the methylation differences between lines. We next calculated 
the pairwise distance between strains based on DMPs (Fig. 2b).
Correlation was highest between the two 3rd generation strains, and 
each individual of the 31st generation was more similar to these two 
lines, from which they were separated by 34 generations, than to the 
other lines from the 31st generation, from which they had diverged 
for 62 generations. Taken together, we conclude that whole-genome 
methylation patterns are largely stable and therefore heritable in 
A. thaliana, but that differences in methylation status accumulate 
gradually, similar to genetic mutations. 

One strain, 69, was exceptional and had 40% more DMPs in com- 
parison with the 3rd generation than the other 31st generation lines 
(Fig. 2b). To determine whether this strain might have a defect in the 
methylation machinery, we sequenced its genome with more and longer 
reads compared to our previous analysis’. We found a non-synonymous 
change in MATERNAL EFFECT EMBRYO ARREST 57 (MEE57), which 
encodes a protein related to METHYLTRANSFERASE 1 (MET1) 
(Supplementary Fig. 7). MEE57 has been reported as essential for endo- 
sperm development”, although several A. thaliana strains lack func- 
tional MEE57 copies”. Thus, whether the MEE57 mutation contributes 
to the increased DMP number in line 69 remains unclear. The fact that 


2 | NATURE | VOL 000 | 00 MONTH 2011 


highest value for each chromosome and class. d, Averaged distribution of all
methylated sites (5mC) and methylated CG and CHG sites along genes. Data
were normalized to the highest value for each sequence context and class. The
coding region is indicated by a black bar. e, Distance of DMPs and N-DMPs to 
the closest upstream and downstream 24-nucleotide siRNA and transposable 
element. Horizontal bar corresponds to median, whiskers indicate entire 75th 
percentile. 


the siblings of this line were as similar to each other as other sibling 
pairs (Fig. 2a) argues against a generally increased epimutation rate. 

Compared to genetic mutations, the frequency of epimutations at 
single cytosine residues was many orders of magnitude higher, with an 
average of close to 30,000 DMPs in the analysed sequence space, 
compared with less than 30 DNA sequence mutations per strain’. 
Thirty-two per cent of DMPs between generations 3 and 31 occurred 
more than once, and 13% more than twice (Fig. 2c). If DMPs arose 
randomly, we would expect less than 1% of recurrent events. That we 
observe many more indicates that certain positions are particularly 
prone to increases or decreases in methylation rate. To investigate 
directly how many DMPs emerge from one generation to the next, 
we analysed the 32nd generation of lines 39 and 49. These individuals 
were progeny of siblings of the individuals interrogated in the 31st 
generation, and shared changes in the 32nd generation should reflect 
differences that arose between the 30th and 31st generation. We found 
on average over 3,300 between-generation DMPs. This is in the same 
range as DMPs between siblings (on average, about 5,000), but more 
than we would have expected from the 30,000 that had accumulated 
between the 3rd and each of the 31st generation lines. One explanation 
is that frequent transgenerational changes in methylation status occur 
at a limited number of sites, and that only a fraction of new DMPs is 
maintained over the longer term. This is corroborated by the obser- 
vation that more than two-thirds of DMPs distinguishing the 32nd 
from the 31st generation in lines 39 and 49 had already been found in 
other 31st generation individuals. 

DNA methylation is known to occur nonrandomly, and to cluster in 
specific segments of the genome. We identified 249 differentially
methylated regions (DMRs) that were at least 50 bp long (median 
100 bp, maximum 650 bp) (Supplementary Table 2 and Supplemen- 
tary Methods). Although probably a conservative estimate, the number 
of DMRs per line is in the same range as the DNA sequence mutations, 
Figure 2 | Epigenetic diversity in the analysed population. a, Hierarchical
clustering based on 20,000 sites each, drawn randomly from DMPs identified in 
pairwise comparison between all strains, cytosines methylated in at least one of 
the analysed strains (including about 6% DMPs), or N-DMPs. b, Heat map
representing pairwise Pearson’s correlation coefficient (PCC) between 


less than 30 per line’. As with CG-DMPs, DMRs preferentially loca- 
lized to genes (Fig. 3a). DMRs did not overlap with known DNA 
mutations in these strains’. Similarly, structural variant discovery with
established methods did not reveal evidence for DMRs being due to
gross DNA lesions. The frequency of DMRs along genes was similar to 
the overall distribution of methylated cytosines, and was reminiscent of 
the pattern of variation seen in wild strains of A. thaliana’ (Supplementary
Fig. 8). There were almost ten times as many DMRs in exons as in
introns (Fig. 3a). Because exon-specific methylation may influence
RNA splicing patterns, this could also be a source of variation in
gene activity. Hierarchical clustering according to DMRs separated 
early- and late-generation strains into distinct groups (Fig. 3b). 
Notably, if we consider the methylation status in the 3rd generation 
individuals as largely reflecting the ancestral pattern, similar fractions of 
DMRs had lost or gained methylation by the 31st generation (Fig. 3c).

Similarly to DMPs, recurrent events constituted more than one- 
third of all DMRs, indicating that the affected genomic regions were 
privileged sites of change (Fig. 3d, e). In addition, comparison of 
generations 32 and 31 identified four short DMRs per line, with re- 
methylation of one segment that had become unmethylated in 
generation 31 (Supplementary Fig. 9). Together, these observations 
demonstrate that large changes in methylation, although rare, can 
occur even within a single generation. 

Differences in promoter and genic DNA methylation can affect 
RNA levels. We compared the transcriptomes of two randomly
selected strains of the 31st generation with the 3rd generation strains 
by RNA-seq (Supplementary Fig. 10) and identified 320 differentially 
expressed genes in pairwise comparisons between strains (Fig. 3f and 
Supplementary Table 3). The two 31st generation lines were separated 
from each other by the most changes, and the two 3rd generation lines 
by the fewest. Seven differentially expressed genes overlapped with a 
individuals, considering all 250,000 DMPs identified between all strains. PCCs
between 3rd generation strains, 0.92; between 3rd and 31st generation, 0.63-0.77;
between 31st generation lines, 0.52-0.66. The histogram on top of the
colour key indicates counts of PCC bins. c, Epiallele frequency of DMPs in the 
31st generation. 


DMR in these strains (Fig. 3g). For the three genes with the highest 
difference in expression level and overlapping with the most cons- 
picuous DMRs, we observed a negative correlation between DNA 
methylation and gene expression. The remaining four genes over- 
lapped with much shorter DMRs and no correlation was apparent 
(Fig. 3e, g and Supplementary Fig. 11). 

We have presented a high-resolution analysis of transgenerational 
variation in DNA methylation of A. thaliana. The molecular mechan- 
isms underlying these changes remain elusive, but siRNAs, which map 
very often in or near transposable elements’’, probably have a role in 
stabilizing DNA methylation, which is corroborated by our finding 
that DMPs tend to be farther from transposable elements and to be 
associated with lower local siRNA activity than N-DMPs. These obser- 
vations indicate that the density and distribution of transposable ele- 
ments, which can differ greatly even between closely related species”, 
affect epigenetic variation throughout the genome. In the material 
analysed here, there was no evidence for DNA mutations acting in 
cis as an important cause of DMRs, although we cannot rule out that a 
non-synonymous mutation in a MET1 homologue might contribute to
increased variation in DNA methylation in trans in one of the lines. 

In contrast to the high frequency of single-nucleotide methylation 
polymorphisms, larger regions appear to change methylation status at 
a rate that is comparable to genetic mutations. On the basis of previous 
work’, it is conceivable that the emergence of DMRs requires specific 
structural features such as nearby repeats. Although DMRs are rare, we 
found evidence for DMRs affecting gene expression, indicating that 
natural, sequence-independent epialleles could potentially contribute 
to phenotypic diversity. There are subtle morphological differences 
between the mutation accumulation lines®, and quantitative genetic 
approaches could be used to link specific DNA mutations or DMRs 
with such traits. How many of the methylation differences found 
between wild strains are due to sequence-independent changes versus
ones driven by transposable elements and other structural variants is 
an important area for further investigation. In addition, it will be 
necessary to follow DNA methylation not only under benign green- 
house conditions, but also in the much more variable and stressful 
natural environment. 

Perhaps our most important finding is that the number of epimuta- 
tions does not increase linearly with time, indicating that many are not 
Figure 3 | Differentially methylated regions (DMRs). a, Distribution of
DMRs according to local annotation. Inter, intergenic; pseudo, pseudogene. 
b, Hierarchical clustering of individuals from the 3rd and 31st generation based 
on methylated sites in DMRs, ranked according to their position in the genome. 
Note the shared methylation differences across lines, but strict pairing of 
siblings. c, Regions with losses and gains of methylation in 1, 2 or 3 strains in 
generation 31 compared to the 3rd generation strains. d, Epiallele frequency of 
DMRs. e, DMR at At3g01345 (Chr3:129,159-130,670) across all strains. 
Methylation on both strands is indicated for each strain. Colours indicate 
methylated reads (red, CG; blue, CHG; yellow, CHH). Grey indicates reads 
supporting non-methylation. For simplicity, only one sibling is shown per 
strain. f, Differentially expressed genes in comparisons between 3rd and 31st 
generation strains. g, Average RNA expression levels of the genes overlapping 
with the regions in e and in Supplementary Fig. 11. Dots indicate values of 
individual samples. 


stably inherited over the long term. In addition to DMPs and DMRs
that arose apparently independently in several strains, we even discovered
a DMR that had become demethylated after 31 generations,
but was re-methylated in the following generation. This suggests that 
DNA methylation in specific regions of the genome can fluctuate over 
relatively short timescales. Such sites can be considered as going
through recurrent cycles of forward and reverse epimutation, which 
is very different from what is found at the level of the genome sequence, 
where reverse mutations are exceedingly rare. Importantly, reversion 
rates directly determine the ability of any type of allele to be subject to 
Darwinian selection. This needs to be taken into account when considering
the potential of epialleles as a factor in evolution’.


METHODS SUMMARY 


Methylome sequencing. DNA was prepared from nuclei isolated from leaf tissue, 
bisulphite treated using a modification of a published protocol", and paired-end 
sequenced on the Illumina GAIIx platform. After image analysis and base calling 
with the Illumina pipeline, reads were processed using SHORE”, and aligned to 
the Col-0 reference genome with GenomeMapper”, adapted to the analysis of 
bisulphite sequencing data. Bisulphite conversion rates, as determined from 
unmethylated chloroplast and spiked-in lambda phage DNA, were 99.72% to 
99.84%. 

Analysis of methylated positions. Single sites were classified as methylated or 
unmethylated by fitting a binomial model based on reads falsely reporting methy- 
lation on the unmethylated plastid genome. We only considered cytosine residues 
covered by at least three independent high-quality base calls in all strains. For the 
determination of significant differences in methylation across strains on single 
sites or regions, we used Fisher’s exact test. 

Data visualization. A Gbrowse instance of the methylation profiles is available at 
http://gbrowse.weigelworld.org/fgb2/gbrowse/ath_methyl_ma. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Plant growth and material. Seeds were derived from Arabidopsis thaliana 
Columbia-0 lines in generation 3 (lines 4 and 8), generation 31 (lines 29, 39, 49, 
59, 69, 79, 89, 99, 109 and 119) and generation 32 (lines 39 and 49), counting from 
the founders, as described by Shaw and colleagues® (Supplementary Fig. 1). Plants 
were grown on soil under long-day conditions (23 °C, 16 h light, 8 h dark) after 
seeds had been stratified in 150 nM GA-supplemented water at 4°C for 6 days. 
Siblings were grown independently at different time points. Positions of the pots 
were randomized. 

Nucleic acid extraction. DNA was extracted from rosettes of individual 21-day- 
old plants. Plant material was flash-frozen in liquid nitrogen and ground in a 
mortar. The ground tissue was re-suspended in nuclei extraction buffer (10 mM 
Tris-HCl pH 9.5, 10 mM EDTA, 100 mM KCl, 0.5 M sucrose, 0.1 mM spermine,
0.4 mM spermidine, 0.1% β-mercaptoethanol). After cell lysis in nuclei extraction
buffer with 10% Triton X-100, nuclei were pelleted by centrifugation at 2,000g for 
120s. Genomic DNA was extracted using the Qiagen Plant DNeasy kit (Qiagen). 
Total RNA was extracted from rosette leaves of individual plants using the Trizol 
(Invitrogen) method according to the manufacturer’s instructions. Residual DNA 
was eliminated by DNase I (Thermo Fisher Scientific) treatment. 

Library preparation. Preparation of DNA libraries for genomic sequencing was 
done using the NEBNext DNA Sample Prep Reagent Set 1 (New England Biolabs), 
following the Illumina Genomic Sample Prep Guide (Illumina). 500-1,000 ng 
genomic DNA was fragmented to 300 bp average size with a Covaris S2 instrument 
using the following settings for 120 s in frequency sweeping mode: intensity 5, duty 
cycle 10%, 200 cycles per burst. DNA was purified on Qiaquick PCR purification 
columns. Preparation of DNA libraries for bisulphite sequencing was adapted 
from ref. 14. Input DNA was fragmented as described above. Libraries were con- 
structed using the NEBNext DNA Sample Prep Reagent Set 1 (New England 
Biolabs) according to the Illumina Genomic Sample Prep Guide with the following 
modifications. We used the Illumina Early Access Methylation Adapter Oligo Mix 
(catalogue number ME-100-0010). After size selection, the non-methylated cyto- 
sine residues were converted to uracil using the EpiTect Plus DNA Bisulfite kit 
(Qiagen) according to the manufacturer’s guidelines. For higher conversion effi- 
ciency the bisulphite incubation was repeated. Library enrichment was performed 
with Pfu Cx HotStart Polymerase (Agilent) and 18 PCR cycles. Libraries for RNA 
sequencing were prepared from 4 µg of total RNA using the Illumina TruSeq RNA
sample prep kit B according to the manufacturer’s protocol. 

Sequencing. All sequencing was performed on an Illumina GAIIx instrument. 
Genomic and bisulphite-converted libraries were sequenced with 2 × 101-bp
paired-end reads. For bisulphite sequencing, conventional A. thaliana DNA geno- 
mic libraries were analysed in control lanes. Transcriptome libraries were 
sequenced with 101-bp single-end reads, with three libraries with different indexing
adapters pooled in one lane; no control lane was used. For image analysis and
base calling, we used the Illumina OLB software version 1.8. 

Processing and alignment of bisulphite-treated reads. The SHORE pipeline” 
was used to trim and quality-filter the reads. Its default parameters were applied for 
the filtering step: reads with more than 2 (or 5) bases in the first 12 (or 25) positions 
with a quality score less than 3 were discarded. Reads were trimmed to the right-most
occurrence of two adjacent bases with quality values equal to or greater than 5.
Trimmed reads shorter than 50 bases were discarded. The remaining high quality 
sequences (on average 82% of raw reads across the sequenced strains) were aligned
against the Arabidopsis thaliana genome sequence version TAIR9
(http://www.arabidopsis.org/portals/genAnnotation/gene_structural_annotation/annotation_data.jsp)
using a modified version of the mapping tool GenomeMapper” that
supports the alignment of bisulphite-converted reads. Bisulphite converts
non-methylated cytosines into uracils, which are propagated as adenine-thymine base
pairs after PCR amplification. GenomeMapper tolerates asymmetrical T-to-C or 
A-to-G mismatches (read base against reference base) and can distinguish between 
reads from the bisulphite-converted strand of a DNA fragment and sequences from
its complementary amplified strand, if the reads have been obtained by paired-end 
sequencing. Only the read from the strand with converted Cs is informative about 
the methylation status of the underlying cytosine site. We allowed for up to 10% 
single-base-pair substitutions along the read length in the alignment process for 
each read to retain most coverage. GenomeMapper reports all alignments with the 
least amount of mismatches for each read. However, only reads mapping uniquely 
to a single position were used for this study. Furthermore, all but one read were 
removed from further analysis if their 5’ ends aligned to the same genomic position, 
to account for amplification biases. A paired-end correction method” was used to 
discard repetitive reads by comparing the distance between reads and their partner 
to the average distance between all read pairs. Reads with abnormal distances were 
removed if there was at least one other alignment of this read in a concordant 
distance to its partner. Finally, read counts on all cytosine sites were obtained with 
SHORE. The ‘scoring matrix approach’ of SHORE” assigns a score to each site by 


testing against different sequence and alignment related features. The criteria and 
complete scoring matrix can be found in Supplementary Table 4. For comparisons 
across lines, cytosines were accepted if at most one intermediate penalty on its score 
was applicable to at least one strain (score ≥ 32). In this case, the threshold for the
other strains was lowered, accepting at most one high penalty (score ≥ 15). In this
way, information from other strains is used to assess sites from the focal strain 
under the assumption of mostly conserved methylation patterns, allowing the 
analysis of additional sites. The methylation statistics on each single strain assumed 
a quality score of 25 or higher, which means no more than two intermediate 
penalties. 
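
The read-filtering and trimming rules described above can be sketched in a few lines of Python. This is only one reading of the stated thresholds, not the SHORE implementation: the function name `filter_and_trim` is hypothetical, and treating the two quality windows (first 12 and first 25 positions) as independent checks is an assumption.

```python
from typing import List, Optional


def filter_and_trim(quals: List[int], min_len: int = 50) -> Optional[int]:
    """Return the trimmed read length, or None if the read is discarded.

    quals holds one integer base-quality score per read position.
    """
    # Discard reads with more than 2 low-quality (<3) bases in the first 12
    # positions, or more than 5 low-quality bases in the first 25 positions.
    for window, max_low in ((12, 2), (25, 5)):
        if sum(q < 3 for q in quals[:window]) > max_low:
            return None

    # Trim to the right-most occurrence of two adjacent bases with quality >= 5.
    keep = 0
    for i in range(len(quals) - 1, 0, -1):
        if quals[i] >= 5 and quals[i - 1] >= 5:
            keep = i + 1
            break

    # Discard trimmed reads shorter than min_len bases.
    return keep if keep >= min_len else None
```

For a 101-base read whose last 41 bases have quality 2, the function returns 60; a read whose first three bases have quality 2 is rejected outright.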

Determination of methylated sites. Sequencing errors, noise and imperfections 
of the bisulphite conversion contribute to the occurrence of sites that appear to be 
weakly methylated. Reads mapping against the non-methylated chloroplast 
sequence allow for objective estimation of the effective background rate of false- 
positive methylation detection. For this purpose, we fitted an independent 
binomial model to the relative proportions of converted and unconverted reads 
that cover cytosines in the chloroplasts. We estimated the binomial rate of false- 
positive methylation from the maximum likelihood estimate, separately for each 
library and for different bins of total read coverage: 


$$\hat{r} = \arg\max_{r} \prod_{s} \mathrm{Binomial}\left(n_s^{u},\ n_s^{c} + n_s^{u} \mid r\right).$$

Here, $n_s^{c}$ and $n_s^{u}$ denote the number of converted and unconverted reads from
the considered cytosine sites $s$. Supplementary Fig. 12a shows the obtained background
methylation rate for a single strain, line 30-39, as a function of the total
read coverage per site. The overall false methylation rate when combining read 
data across the range of read coverage was 0.22%, which deviates significantly from
higher error estimates when considering low-coverage regions in isolation. To
account for the variability in error rates in the downstream analyses, we used
specific error models for each strain and for read-coverage bins of multiples of
fivefold, yielding error rates between 0.2% and 5.0% (Supplementary Fig. 12b). For 
coverage bins with too few sites for robust statistical estimation (<50), we imputed 
the false methylation rate from the closest sufficiently populated coverage bin. 
Given the estimated rates for false methylation, we carried out a genome-wide test 
for significant methylation of cytosines. For each site, we calculated the P value
under the background model. We then used Storey’s method, an extension of the
Benjamini-Hochberg stepdown procedure, to assess genome-wide significance 
using q values. At a joint false discovery rate (FDR) of 5% we found between 
2,316,966 and 3,458,949 methylated sites in each strain (Supplementary Table 
1). When reducing FDR to 0.1%, we still retained almost 85% of the methylated 
sites, showing that the number of sites with weak methylation evidence was low. 
For analysis of methylated sites reported in this study, an FDR of 5% was deemed 
to be acceptable. 
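
The background model lends itself to a short worked example. If a single rate r is shared by all chloroplast sites in a coverage bin, the maximum likelihood estimate reduces to the pooled fraction of unconverted reads, and the per-site P value is the upper tail of the binomial distribution. The Python sketch below assumes this simplification. All function names are hypothetical, and the plain Benjamini-Hochberg adjustment is a stand-in for Storey's q-value method, which additionally estimates the proportion of true null hypotheses.

```python
from math import comb
from typing import List, Tuple


def estimate_false_rate(sites: List[Tuple[int, int]]) -> float:
    """MLE of the false-methylation rate from (converted, unconverted) counts
    at unmethylated control sites: pooled unconverted reads / pooled reads."""
    unconverted = sum(u for _, u in sites)
    total = sum(c + u for c, u in sites)
    return unconverted / total


def methylation_p_value(unconverted: int, coverage: int, rate: float) -> float:
    """P(X >= unconverted) for X ~ Binomial(coverage, rate)."""
    return sum(
        comb(coverage, k) * rate**k * (1 - rate) ** (coverage - k)
        for k in range(unconverted, coverage + 1)
    )


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A site would then be called methylated when its adjusted value falls below the chosen FDR threshold (5% in the analysis above).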
Identification of differentially methylated positions. From the 13.9 million 
cytosines for which we had at least three independent high-quality reads in each 
strain, we selected sites that showed significant methylation in at least one strain, 
resulting in 3,067,017 positions. Sites with statistically significant methylation 
differences were identified with Fisher’s exact test. P values from individual tests 
per site were combined into single P values via conservative Bonferroni correction. 
Genome-wide FDRs were then estimated using Storey’s method’’. To limit false- 
positive DMPs, we first identified 69,583 DMPs between siblings at a relaxed FDR 
of 10%. These sites were excluded as were 8,893 DMPs distinguishing the two 3rd 
generation strains. This left 2,988,541 positions as the final set to test for differ- 
ential methylation between generations. Twenty pairwise tests of each of the ten 
31st generation strains against both 3rd generation strains were conducted on sites 
consistently methylated between 31st generation siblings and in the 3rd genera- 
tion. At an FDR of 5%, this yielded 186,248 DMPs. DMP allele frequency was 
obtained by progressively removing the strain with the lowest q value and correct- 
ing the remaining P values for multiple testing by the methods described above. 

We applied the same strategy to identify DMPs that differed either between the 
31st and 3rd generation, or between 31st generation strains. Count data from 
replicates were combined for each site, followed by pairwise Fisher’s exact tests 
between all combinations of strains (66 tests). We estimated P values for at least 
one differential pair using a Bonferroni correction, followed by Storey’s method"! 
to assess genome-wide significance. At a joint FDR of 5%, this identified 253,546 
DMPs across all 12 strains. 
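
The per-site testing scheme can be illustrated as follows. The sketch implements a two-sided Fisher's exact test on a 2 × 2 table of methylated and unmethylated read counts for two strains, and then combines the pairwise P values for one site with a Bonferroni correction. It is a minimal illustration in pure Python under the assumption that the combined P value is the smallest pairwise P value multiplied by the number of tests; the function names are hypothetical and the genome-wide FDR step is omitted.

```python
from math import comb
from typing import List


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows are two strains; columns are methylated and unmethylated reads.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x methylated reads in the first strain.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def bonferroni_combine(pvals: List[float]) -> float:
    """P value for 'at least one pair differs' across the pairwise tests."""
    return min(1.0, min(pvals) * len(pvals))
```

With only three reads per strain, the smallest attainable two-sided P value is 0.1 (all reads methylated in one strain and none in the other), which illustrates why local read coverage limits detection power.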

Assessing statistical power. Two main factors influence the power to detect 
methylation differences: the number of statistical tests and local read coverage. 
To assess the impact of multiple testing, we applied the approach described above 
to all sites with at least threefold coverage in at least 12 of 24 individuals examined. 
Of 25.3 million such positions, 4,547,568 were found to be methylated in at least 
one of the lines, compared to 3,067,017 out of 13.9 million positions when con- 
sidering only sites with complete information. The number of sites assessed as 
methylated thus increased roughly linearly with the number of tested sites, as did 


©2011 Macmillan Publishers Limited. All rights reserved 


the number of differentially methylated positions. Similarly, the fraction of DMPs 
shared in more than one 31st generation strain, ~31%, was very similar to the 
~32% found among sites with complete information. We conclude that our 
method is largely insensitive to the number of tests performed. 

To assess the effect of read coverage, we determined how many DMPs could be 
identified after subsampling at 25%, 50% and 75% of total coverage. We identified 
almost twice as many DMPs with 50% compared to 25% coverage, but only 13% 
additional DMPs were identified when increasing coverage from 75% to 100% 
(Supplementary Fig. 13). Although not yet asymptotic, we estimate that the false- 
negative rate is well below 50%, and most likely closer to 10%. 

Identification of differentially methylated regions. The ~186,000 DMPs dis- 
tinguishing 31st from both 3rd generation lines were consolidated into regions of 
adjacent DMPs for each strain, with a maximum distance of 50 bp between DMPs. 
We then used Fisher’s exact test on the sum of methylated and unmethylated reads 
in both siblings, averaged across positions within the region. Resulting P values 
were corrected with Storey’s method*! and an FDR of 5% was accepted. 
Statistically significant regions from different strains were merged if they over- 
lapped by at least 20% of their combined length and if the methylation change was 
in the same direction compared to the 3rd generation lines. Short regions contain- 
ing only a small number of strongly differential sites were excluded by requiring 
DMRs to have a minimum length of 50 bp and to contain at least ten methylated 
positions and at least five DMPs. 
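
The consolidation of DMPs into candidate regions can be sketched as a single pass over sorted positions. The Python example below covers only the distance-based grouping and the length and DMP-count filters; the Fisher's exact test on region-level read counts, the merging across strains and the requirement of at least ten methylated positions are left out. The function name `cluster_dmps` and the use of inclusive coordinates are assumptions.

```python
from typing import Iterable, List, Tuple


def cluster_dmps(
    positions: Iterable[int],
    max_gap: int = 50,
    min_len: int = 50,
    min_dmps: int = 5,
) -> List[Tuple[int, int]]:
    """Group DMP positions on one chromosome into candidate regions.

    Adjacent DMPs are joined when they lie at most max_gap bp apart. Regions
    are returned as inclusive (start, end) pairs after filtering.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        # Start a new cluster when the gap to the previous DMP is too large.
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    return [
        (c[0], c[-1])
        for c in clusters
        if c[-1] - c[0] + 1 >= min_len and len(c) >= min_dmps
    ]
```

Isolated DMPs and short, dense clusters are both dropped by the filters, so only extended runs of differential sites are passed on to region-level testing.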

Mapping of DMPs, N-DMPs and DMRs to genomic elements. We used the 
TAIR10 annotation (http://www.arabidopsis.org/portals/genAnnotation/gene_ 
structural_annotation/ annotation_data.jsp) to determine overlap of genes, pseudo- 
genes and transposable elements with methylated positions. We defined intergenic 
regions as regions that did not correspond to any annotated feature. A DMR was 
considered mapping to a particular genomic element if it overlapped with such an 
element for more than 20% of its length. 
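
The 20% overlap rule can be written as a short predicate. The sketch below assumes inclusive coordinates on the same chromosome; the function name `maps_to_element` is hypothetical.

```python
from typing import Tuple


def maps_to_element(
    dmr: Tuple[int, int], element: Tuple[int, int], min_frac: float = 0.2
) -> bool:
    """True if the DMR overlaps the element for more than min_frac of the
    DMR's own length. Both intervals are inclusive (start, end) pairs."""
    start, end = dmr
    e_start, e_end = element
    # Negative or zero values mean the intervals do not overlap.
    overlap = min(end, e_end) - max(start, e_start) + 1
    return overlap > min_frac * (end - start + 1)
```

Note that the fraction is taken relative to the DMR, not the annotated element, so a short DMR lying entirely inside a long gene always maps to that gene.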

Assessing the distance to the closest siRNA or transposable element. We 
determined the distance between a methylated position and the closest upstream 
and downstream siRNA using a published data set for A. thaliana aerial tissue’ 
(NCBI GEO accession number GSM518432). For transposable elements, we used 
the TAIR10 annotation. Statistical significance was tested with a two-sided,
unpaired Student’s t-test on the measured distances. Pericentromeric regions were 
defined as described”. 
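
The distance calculation can be illustrated with a binary search over sorted feature coordinates. In this sketch each siRNA or transposable element is reduced to a single coordinate, which is a simplification: for real interval annotations, the nearest end of each feature would be used instead. The function name `nearest_distances` is hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def nearest_distances(
    pos: int, features: List[int]
) -> Tuple[Optional[int], Optional[int]]:
    """Distance from pos to the closest upstream and downstream feature.

    features must be sorted in ascending order. A distance is None when no
    feature exists on that side. A feature located exactly at pos counts as
    downstream with distance 0.
    """
    i = bisect_left(features, pos)
    upstream = pos - features[i - 1] if i > 0 else None
    downstream = features[i] - pos if i < len(features) else None
    return upstream, downstream
```

Collecting these distances separately for DMPs and N-DMPs yields the two samples that are then compared with the t-test described above.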

Analysis of gene expression. DNA sequences resulting from RNA library pre- 
paration were barcode-sorted and quality filtered in SHORE”, and aligned using 
BWA” to the TAIR10 gene annotation. Reads were filtered for duplicates and
required to have a mapping quality of at least 37. The remaining mappings were 
used to generate gene-level counts for expression analysis. We only considered 
genes for which the total counts in all samples combined exceeded 30. Between- 
sample expression correlations and strain-distribution plots were used for quality 
control to identify poor samples, and pairwise comparisons of expression were 
performed using the DEseq package* implemented in R. Differentially expressed 
genes were identified by a combination of per-gene variance (P = 0.01, with
Benjamini and Yekutieli correction) and common variance (≥2× change).
Density of genes without expression support was plotted along chromosomes
in sliding windows considering only genes with at least 50 methylated or
unmethylated calls. Windows of gene methylation were calculated for entire gene 
bodies as the fraction of methylated positions (methylated in any sample) divided 
by the total number of called positions. Very similar results were obtained con- 
sidering sites methylated in all samples. Data were visualized with the ggplots 
package in R*. 

Data visualization. A Gbrowse instance of the methylation profiles is available at 
http://gbrowse.weigelworld.org/fgb2/gbrowse/ath_methyl_ma. 